[jira] [Closed] (YUNIKORN-3026) Resource overcommit

2025-02-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit closed YUNIKORN-3026.
--

> Resource overcommit
> ---
>
> Key: YUNIKORN-3026
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3026
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: core - scheduler
>Reporter: Rafał Boniecki
>Priority: Major
>
> Provide an implementation of resource (CPU, memory) overcommit.
> The Kubernetes requests/limits model wastes a lot of resources. It is 
> impossible to set static resources correctly at pod creation without either 
> wasting resources or causing resource starvation. It would be useful for the 
> scheduler to be able to ignore these requests/limits and schedule by real 
> load instead, up to some configured soft limit. When a hard limit is hit 
> (e.g. 60% of real CPU usage and/or 40% of real memory usage), it should be 
> able to deschedule some pods (which ones is configurable, e.g. by presence of 
> an annotation) using some algorithm (e.g. random, highest resource usage, or 
> lowest resource usage, ideally combined with the pod's priority and/or its 
> labels/annotations).
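
The behavior requested above can be sketched in Go. This is only an illustration of the proposal, which was closed as Won't Do; nothing here exists in YuniKorn. The `Pod` type, the `canSchedule` and `pickVictims` helpers, and the choice of "lowest priority first, highest usage breaks ties" as the ordering are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, simplified view of a running pod
// (not a YuniKorn or Kubernetes type).
type Pod struct {
	Name      string
	Priority  int     // lower values are descheduled first
	CPUUsage  float64 // measured usage as a fraction of node capacity
	Evictable bool    // e.g. set from an opt-in annotation
}

// canSchedule reports whether a node may accept new pods:
// measured usage must still be below the soft limit.
func canSchedule(used, softLimit float64) bool {
	return used < softLimit
}

// pickVictims returns the names of evictable pods to deschedule, in order,
// until the projected usage is at or below hardLimit (or candidates run out).
func pickVictims(pods []Pod, used, hardLimit float64) []string {
	candidates := make([]Pod, 0, len(pods))
	for _, p := range pods {
		if p.Evictable {
			candidates = append(candidates, p)
		}
	}
	// Lowest priority first; among equal priorities, highest usage first.
	sort.SliceStable(candidates, func(i, j int) bool {
		if candidates[i].Priority != candidates[j].Priority {
			return candidates[i].Priority < candidates[j].Priority
		}
		return candidates[i].CPUUsage > candidates[j].CPUUsage
	})
	var victims []string
	for _, p := range candidates {
		if used <= hardLimit {
			break
		}
		victims = append(victims, p.Name)
		used -= p.CPUUsage
	}
	return victims
}

func main() {
	pods := []Pod{
		{Name: "a", Priority: 1, CPUUsage: 0.10, Evictable: true},
		{Name: "b", Priority: 0, CPUUsage: 0.05, Evictable: true},
		{Name: "c", Priority: 0, CPUUsage: 0.20, Evictable: true},
		{Name: "d", Priority: 0, CPUUsage: 0.30, Evictable: false},
	}
	fmt.Println("can schedule:", canSchedule(0.75, 0.50))
	fmt.Println("victims:", pickVictims(pods, 0.75, 0.60))
}
```

Pods without the opt-in flag are never selected, so the function may return fewer victims than are needed to get back under the hard limit.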



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org



[jira] [Resolved] (YUNIKORN-3026) Resource overcommit

2025-02-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3026.

Resolution: Won't Do

> Resource overcommit
> ---
>
> Key: YUNIKORN-3026
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3026
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: core - scheduler
>Reporter: Rafał Boniecki
>Priority: Major
>
> Provide an implementation of resource (CPU, memory) overcommit.
> The Kubernetes requests/limits model wastes a lot of resources. It is 
> impossible to set static resources correctly at pod creation without either 
> wasting resources or causing resource starvation. It would be useful for the 
> scheduler to be able to ignore these requests/limits and schedule by real 
> load instead, up to some configured soft limit. When a hard limit is hit 
> (e.g. 60% of real CPU usage and/or 40% of real memory usage), it should be 
> able to deschedule some pods (which ones is configurable, e.g. by presence of 
> an annotation) using some algorithm (e.g. random, highest resource usage, or 
> lowest resource usage, ideally combined with the pod's priority and/or its 
> labels/annotations).






[jira] [Updated] (YUNIKORN-2996) Update build dependency to kubernetes v1.32.x

2025-02-10 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2996:
---
Summary: Update build dependency to kubernetes v1.32.x  (was: Update build 
dependency to kubernetes v1.32.0)

> Update build dependency to kubernetes v1.32.x
> -
>
> Key: YUNIKORN-2996
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2996
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>
> Kubernetes v1.32.0 is out, and has several scheduler-related changes. We 
> should plan to update our build dependency to match.






[jira] [Updated] (YUNIKORN-3024) "make lint" cannot be completed on macos/darwin/arm64

2025-02-07 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3024:
---
Priority: Minor  (was: Critical)

> "make lint" cannot be completed on macos/darwin/arm64
> -
>
> Key: YUNIKORN-3024
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3024
> Project: Apache YuniKorn
>  Issue Type: Improvement
>Reporter: Kaichia Chen
>Assignee: Kaichia Chen
>Priority: Minor
> Attachments: image-2025-02-07-22-51-58-251.png
>
>
> The Go lint tool used by "make lint" across the repository is golangci-lint 
> version 1.57.2. "make lint" hangs when run on macos/darwin/arm64.
> I tried downgrading to version 1.54.2, and that works.






[jira] [Resolved] (YUNIKORN-3021) Change default volume binding timeout to 10 minutes

2025-01-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3021.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged PRs to master.

> Change default volume binding timeout to 10 minutes
> ---
>
> Key: YUNIKORN-3021
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3021
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation, shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available, release-notes
> Fix For: 1.7.0
>
>
> The default volume bind timeout has been set at 10 seconds ever since it was 
> first made configurable in YuniKorn 0.8.0. However, the default in Kubernetes 
> is 10 minutes. Given this is a common failure point reported by users, we 
> should update the default to match Kubernetes (10 minutes).






[jira] [Updated] (YUNIKORN-3021) Change default volume binding timeout to 10 minutes

2025-01-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3021:
---
Labels: release-notes  (was: )

> Change default volume binding timeout to 10 minutes
> ---
>
> Key: YUNIKORN-3021
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3021
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation, shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: release-notes
>
> The default volume bind timeout has been set at 10 seconds ever since it was 
> first made configurable in YuniKorn 0.8.0. However, the default in Kubernetes 
> is 10 minutes. Given this is a common point of contention for users, we 
> should update the default to match Kubernetes (10 minutes).






[jira] [Updated] (YUNIKORN-3021) Change default volume binding timeout to 10 minutes

2025-01-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3021:
---
Description: The default volume bind timeout has been set at 10 seconds 
ever since it was first made configurable in YuniKorn 0.8.0. However, the 
default in Kubernetes is 10 minutes. Given this is a common failure point 
reported by users, we should update the default to match Kubernetes (10 
minutes).  (was: The default volume bind timeout has been set at 10 seconds 
ever since it was first made configurable in YuniKorn 0.8.0. However, the 
default in Kubernetes is 10 minutes. Given this is a common point of contention 
for users, we should update the default to match Kubernetes (10 minutes).)

> Change default volume binding timeout to 10 minutes
> ---
>
> Key: YUNIKORN-3021
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3021
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation, shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: release-notes
>
> The default volume bind timeout has been set at 10 seconds ever since it was 
> first made configurable in YuniKorn 0.8.0. However, the default in Kubernetes 
> is 10 minutes. Given this is a common failure point reported by users, we 
> should update the default to match Kubernetes (10 minutes).






[jira] [Updated] (YUNIKORN-3021) Change default volume binding timeout to 10 minutes

2025-01-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3021:
---
Component/s: documentation

> Change default volume binding timeout to 10 minutes
> ---
>
> Key: YUNIKORN-3021
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3021
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation, shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>
> The default volume bind timeout has been set at 10 seconds ever since it was 
> first made configurable in YuniKorn 0.8.0. However, the default in Kubernetes 
> is 10 minutes. Given this is a common point of contention for users, we 
> should update the default to match Kubernetes (10 minutes).






[jira] [Created] (YUNIKORN-3021) Change default volume binding timeout to 10 minutes

2025-01-29 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-3021:
--

 Summary: Change default volume binding timeout to 10 minutes
 Key: YUNIKORN-3021
 URL: https://issues.apache.org/jira/browse/YUNIKORN-3021
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: shim - kubernetes
Reporter: Craig Condit
Assignee: Craig Condit


The default volume bind timeout has been set at 10 seconds ever since it was 
first made configurable in YuniKorn 0.8.0. However, the default in Kubernetes 
is 10 minutes. Given this is a common point of contention for users, we should 
update the default to match Kubernetes (10 minutes).






[jira] [Commented] (YUNIKORN-3007) Improve YuniKorn reservation logic

2025-01-27 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17921450#comment-17921450
 ] 

Craig Condit commented on YUNIKORN-3007:


I disagree 100% with removing reservations. However, I think there is room for 
some design work on improving how they work and providing ways to customize 
that behavior. I propose we use this Jira issue to have a discussion on that.

> Improve YuniKorn reservation logic
> --
>
> Key: YUNIKORN-3007
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3007
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - scheduler
>Reporter: Rainie Li
>Assignee: Rainie Li
>Priority: Major
> Attachments: queue.yaml, test-job1.yaml, test-job2.yaml, 
> test-job3.yaml
>
>
> *Issue and Investigation:*
> We’ve observed Spark job slowness on our prod cluster, especially when 
> large jobs are running on the cluster. This performance degradation impacts 
> user experience.
> When cluster utilization is high and there are numerous pending pods, large 
> jobs that arrive first can reserve resources on nodes. This reservation 
> mechanism prevents new jobs from getting the resources they need, which works 
> against preemption.
> *Test case:*
> Please refer to the attached files. 
>  # Submit test-job1 to queue-one
>  # Once test-job1 is running, submit test-job2 to queue-two
>  # Once test-job2 is running and pending memory exceeds 40TB, 
> submit test-job3 to queue-three
> *Proposal:*
> YuniKorn incorporates multiple scenarios for making reservations. To address 
> the current issue, we propose retaining only the preemption-related 
> reservations, as preemption relies on reservations to ensure that resources 
> can be reallocated later.
> The rationale for removing other reservation scenarios is as follows:
>  # If a queue's usage exceeds its guaranteed resources, it should not 
> maintain reservations.
>  # Conversely, if a queue's usage falls below its guaranteed resources, it 
> should be able to secure resources through preemption.
> *Our fix:* 
> We applied a fix internally to remove the allocation case here 
> [https://github.com/apache/yunikorn-core/blob/master/pkg/scheduler/objects/application.go#L1532]
>  
>  
> Reservation 
> [https://yunikorn.apache.org/release-announce/0.8.0/#resource-reservation] 
> appears to be by design, but in our case it works against preemption.
>  I would like to open this ticket to have a follow-up discussion with the 
> community about a better solution to this issue.  cc 
> [~wilfreds] 






[jira] [Comment Edited] (YUNIKORN-2864) Add e2e tests for InPlacePodVerticalScaling feature

2025-01-24 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17916893#comment-17916893
 ] 

Craig Condit edited comment on YUNIKORN-2864 at 1/25/25 6:32 AM:
-

[~kaichiaboy], we need to be sure that we are testing how YuniKorn interacts 
with the feature (not the feature itself). Specifically, we need to ensure that 
not only are pods updated correctly, but that YuniKorn's internal accounting 
remains correct throughout the process. Additionally, Kubernetes 1.32 
introduced some more changes to how vertical pod scaling works, so we should 
probably integrate YUNIKORN-2996 before we do this one.


was (Author: ccondit):
[~kaichiaboy], we need to be sure that we are testing how YuniKorn interacts 
with the feature (not the feature itself). Specifically, we need to ensure that 
not only are pods updated correctly, but that YuniKorn's internal accounting 
remains correct throughout the process. Additionally, Kubernetes 1.32 
introduced some more changes to how vertical pod scaling works, so we should 
probably integrate YUNIKORN-2996 before we do this one. 
https://yunikorn.apache.org/release-announce/1.6.1

> Add e2e tests for InPlacePodVerticalScaling feature
> ---
>
> Key: YUNIKORN-2864
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2864
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: shim - kubernetes
>Reporter: Craig Condit
>Assignee: Kaichia Chen
>Priority: Major
>
> Build e2e tests to verify YuniKorn behavior when the 
> InPlacePodVerticalScaling feature flag is enabled. This is possible in K8s 
> 1.27 and later. Tests should be skipped on K8s 1.26 or earlier, or if the 
> feature flag is not enabled.
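
The skip rule in the description above can be sketched as a small Go helper. The function names `minorVersion` and `shouldSkip` are hypothetical and are not taken from the YuniKorn e2e suite; the sketch assumes the cluster version is available as a string such as "v1.27.3" and that the state of the feature flag has already been determined elsewhere.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minorVersion extracts the minor number from a Kubernetes version string
// such as "v1.27.3". Only major version 1 is accepted in this sketch.
func minorVersion(version string) (int, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) < 2 || parts[0] != "1" {
		return 0, fmt.Errorf("unsupported version %q", version)
	}
	// Some distributions report the minor version with a trailing "+".
	return strconv.Atoi(strings.TrimSuffix(parts[1], "+"))
}

// shouldSkip reports whether the in-place scaling tests should be skipped:
// on K8s 1.26 or earlier, or when the feature flag is not enabled.
func shouldSkip(version string, featureEnabled bool) (bool, error) {
	minor, err := minorVersion(version)
	if err != nil {
		return false, err
	}
	return minor < 27 || !featureEnabled, nil
}

func main() {
	for _, v := range []string{"v1.26.9", "v1.27.3", "v1.32.0"} {
		skip, err := shouldSkip(v, true)
		fmt.Println(v, "skip:", skip, "err:", err)
	}
}
```

Returning an error for an unparseable version, instead of silently skipping, keeps a misconfigured test environment visible.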






[jira] [Commented] (YUNIKORN-2864) Add e2e tests for InPlacePodVerticalScaling feature

2025-01-24 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17916893#comment-17916893
 ] 

Craig Condit commented on YUNIKORN-2864:


[~kaichiaboy], we need to be sure that we are testing how YuniKorn interacts 
with the feature (not the feature itself). Specifically, we need to ensure that 
not only are pods updated correctly, but that YuniKorn's internal accounting 
remains correct throughout the process. Additionally, Kubernetes 1.32 
introduced some more changes to how vertical pod scaling works, so we should 
probably integrate YUNIKORN-2996 before we do this one. 
https://yunikorn.apache.org/release-announce/1.6.1

> Add e2e tests for InPlacePodVerticalScaling feature
> ---
>
> Key: YUNIKORN-2864
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2864
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: shim - kubernetes
>Reporter: Craig Condit
>Assignee: Kaichia Chen
>Priority: Major
>
> Build e2e tests to verify YuniKorn behavior when the 
> InPlacePodVerticalScaling feature flag is enabled. This is possible in K8s 
> 1.27 and later. Tests should be skipped on K8s 1.26 or earlier, or if the 
> feature flag is not enabled.






[jira] [Resolved] (YUNIKORN-3012) [UMBRELLA] YuniKorn 1.6.1 release efforts

2025-01-24 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3012.

Fix Version/s: 1.6.1
   Resolution: Fixed

All tasks resolved.

> [UMBRELLA] YuniKorn 1.6.1 release efforts
> -
>
> Key: YUNIKORN-3012
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3012
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: release
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
> Fix For: 1.6.1
>
>
> Changes:
> https://issues.apache.org/jira/browse/YUNIKORN-3005?filter=12353780






[jira] [Resolved] (YUNIKORN-3015) Update website for 1.6.1

2025-01-24 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3015.

Fix Version/s: 1.6.1
   Resolution: Fixed

Merged to master.

> Update website for 1.6.1
> 
>
> Key: YUNIKORN-3015
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3015
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: website
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.1
>
>
> Multiple tasks all need to be done at once:
>  * create versioned docs
>  * create release announcement
>  * update downloads page
>  * update doap file






[jira] [Resolved] (YUNIKORN-3013) Tagging for 1.6.1

2025-01-24 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3013.

Fix Version/s: 1.6.1
   Resolution: Fixed

Final tagging complete.

> Tagging for 1.6.1
> -
>
> Key: YUNIKORN-3013
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3013
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
> Fix For: 1.6.1
>
>







[jira] [Deleted] (YUNIKORN-3014) Release notes for 1.6.1

2025-01-24 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit deleted YUNIKORN-3014:
---


> Release notes for 1.6.1
> ---
>
> Key: YUNIKORN-3014
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3014
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>







[jira] [Updated] (YUNIKORN-2953) Placeholder release count incorrect

2025-01-21 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2953:
---
 Fix Version/s: 1.6.1
Target Version: 1.7.0, 1.6.1  (was: 1.7.0)

> Placeholder release count incorrect
> ---
>
> Key: YUNIKORN-2953
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2953
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: core - scheduler
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.1
>
>
> Even after YUNIKORN-2926 we have not fully fixed the placeholder release 
> count issue. 
> On timeout, the release of an allocated placeholder is counted twice: first 
> on release, as part of the cleanup that is triggered, and then again when the 
> allocation is actually removed.






[jira] [Commented] (YUNIKORN-3012) [UMBRELLA] YuniKorn 1.6.1 release efforts

2025-01-15 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913401#comment-17913401
 ] 

Craig Condit commented on YUNIKORN-3012:


RC1 artifacts here: https://dist.apache.org/repos/dist/dev/yunikorn/1.6.1-RC1/

> [UMBRELLA] YuniKorn 1.6.1 release efforts
> -
>
> Key: YUNIKORN-3012
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3012
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: release
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>
> Changes:
> https://issues.apache.org/jira/browse/YUNIKORN-3005?filter=12353780






[jira] [Updated] (YUNIKORN-3012) [UMBRELLA] YuniKorn 1.6.1 release efforts

2025-01-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3012:
---
Description: 
Changes:

https://issues.apache.org/jira/browse/YUNIKORN-3005?filter=12353780

> [UMBRELLA] YuniKorn 1.6.1 release efforts
> -
>
> Key: YUNIKORN-3012
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3012
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: release
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>
> Changes:
> https://issues.apache.org/jira/browse/YUNIKORN-3005?filter=12353780






[jira] [Updated] (YUNIKORN-3000) Add support for setting GOGC / GOMEMLIMIT in Helm chart

2025-01-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3000:
---
Target Version: 1.7.0, 1.6.1  (was: 1.7.0)

> Add support for setting GOGC / GOMEMLIMIT in Helm chart
> ---
>
> Key: YUNIKORN-3000
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3000
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: release, release-notes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> As of Go 1.19, Go supports reading a {{GOMEMLIMIT}} environment variable to 
> limit the amount of RAM the process will use before garbage collecting. This 
> is a soft limit, but helps constrain resource usage and can help avoid being 
> OOMKilled by Kubernetes. Since we set our existing memory request to 1 GiB, 
> and limit to 2 GiB, I propose setting the default {{GOMEMLIMIT}} value to 1.5 
> GiB. This allows for non-GC overhead plus transient spikes, and should keep 
> us well under the threshold for being OOMKilled. We should also allow 
> overriding {{GOGC}} as well.
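
The arithmetic behind the proposed default can be checked with a short Go sketch. The `midpointLimit` helper is hypothetical and treats 1.5 GiB as the midpoint between the 1 GiB request and the 2 GiB limit, which is an interpretation of the proposal, not something the issue states. `debug.SetMemoryLimit` from the standard library (Go 1.19 and later) is the programmatic counterpart of the `GOMEMLIMIT` environment variable.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

const gib = int64(1) << 30

// midpointLimit returns a soft memory limit halfway between the container's
// memory request and its memory limit (both in bytes).
func midpointLimit(request, limit int64) int64 {
	return request + (limit-request)/2
}

func main() {
	soft := midpointLimit(1*gib, 2*gib)

	// Equivalent to starting the process with GOMEMLIMIT set to this value.
	// The limit is soft: the runtime collects more aggressively as the heap
	// approaches it, but it does not guarantee staying below it.
	debug.SetMemoryLimit(soft)

	// GOMEMLIMIT accepts unit suffixes such as MiB and GiB.
	fmt.Printf("GOMEMLIMIT=%dMiB\n", soft>>20)
}
```

In a deployment the value would normally be supplied through the environment variable, so that it can be changed without rebuilding the binary.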






[jira] [Updated] (YUNIKORN-3000) Add support for setting GOGC / GOMEMLIMIT in Helm chart

2025-01-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3000:
---
Fix Version/s: 1.6.1

> Add support for setting GOGC / GOMEMLIMIT in Helm chart
> ---
>
> Key: YUNIKORN-3000
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3000
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: release, release-notes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.1
>
>
> As of Go 1.19, Go supports reading a {{GOMEMLIMIT}} environment variable to 
> limit the amount of RAM the process will use before garbage collecting. This 
> is a soft limit, but helps constrain resource usage and can help avoid being 
> OOMKilled by Kubernetes. Since we set our existing memory request to 1 GiB, 
> and limit to 2 GiB, I propose setting the default {{GOMEMLIMIT}} value to 1.5 
> GiB. This allows for non-GC overhead plus transient spikes, and should keep 
> us well under the threshold for being OOMKilled. We should also allow 
> overriding {{GOGC}} as well.






[jira] [Created] (YUNIKORN-3015) Update website for 1.6.1

2025-01-15 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-3015:
--

 Summary: Update website for 1.6.1
 Key: YUNIKORN-3015
 URL: https://issues.apache.org/jira/browse/YUNIKORN-3015
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: website
Reporter: Craig Condit
Assignee: Craig Condit


Multiple tasks all need to be done at once:
 * create versioned docs
 * create release announcement
 * update downloads page
 * update doap file






[jira] [Created] (YUNIKORN-3014) Release notes for 1.6.0

2025-01-15 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-3014:
--

 Summary: Release notes for 1.6.0
 Key: YUNIKORN-3014
 URL: https://issues.apache.org/jira/browse/YUNIKORN-3014
 Project: Apache YuniKorn
  Issue Type: Sub-task
Reporter: Craig Condit
Assignee: Craig Condit









[jira] [Updated] (YUNIKORN-2871) Update website for 1.6.0

2025-01-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2871:
---
Target Version: 1.6.0

> Update website for 1.6.0
> 
>
> Key: YUNIKORN-2871
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2871
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> Multiple tasks all need to be done at once:
>  * create versioned docs
>  * create release announcement
>  * update downloads page
>  * update roadmap doc
>  * update doap file
>  * K8s supported versions update to add 1.30 and 1.31






[jira] [Updated] (YUNIKORN-3014) Release notes for 1.6.1

2025-01-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3014:
---
Component/s: website
Summary: Release notes for 1.6.1  (was: Release notes for 1.6.0)

> Release notes for 1.6.1
> ---
>
> Key: YUNIKORN-3014
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3014
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: website
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>







[jira] [Created] (YUNIKORN-3012) [UMBRELLA] YuniKorn 1.6.1 release efforts

2025-01-15 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-3012:
--

 Summary: [UMBRELLA] YuniKorn 1.6.1 release efforts
 Key: YUNIKORN-3012
 URL: https://issues.apache.org/jira/browse/YUNIKORN-3012
 Project: Apache YuniKorn
  Issue Type: Task
  Components: release
Reporter: Craig Condit
Assignee: Craig Condit









[jira] [Created] (YUNIKORN-3013) Tagging for 1.6.1

2025-01-15 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-3013:
--

 Summary: Tagging for 1.6.1
 Key: YUNIKORN-3013
 URL: https://issues.apache.org/jira/browse/YUNIKORN-3013
 Project: Apache YuniKorn
  Issue Type: Sub-task
Reporter: Craig Condit
Assignee: Craig Condit









[jira] [Resolved] (YUNIKORN-2977) [Umbrella] DaemonSet preemption hardening

2024-12-19 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2977.

Fix Version/s: 1.7.0
   1.6.1
   Resolution: Fixed

> [Umbrella] DaemonSet preemption hardening
> -
>
> Key: YUNIKORN-2977
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2977
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 1.7.0, 1.6.1
>
>
> We identified a couple of issues with DaemonSet preemption. The current 
> implementation is not stable.
> Notable issues:
> 1. Flooding the logs from {{tryAllocate()}} because unreservation fails
> 2. Flooding the logs if there are no victims found
> 3. Allocation is stuck if the DS pod cannot run on the target node due to 
> predicate errors






[jira] [Resolved] (YUNIKORN-3002) Update Go dependencies for CVE fixes

2024-12-19 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3002.

Fix Version/s: 1.7.0
   1.6.1
   Resolution: Fixed

Merged both PRs to master and cherry-picked to branch-1.6 in shim and core.

> Update Go dependencies for CVE fixes
> 
>
> Key: YUNIKORN-3002
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3002
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: core - scheduler, shim - kubernetes
>Reporter: Wilfred Spiegelenburg
>Assignee: Tzu-Hua Lan
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.1
>
>
> Apply changes for new CVEs:
>  * CVE-2024-45337: CRITICAL, not affected as we do not use SSH, just upgrade 
> crypto to satisfy the scanner: golang.org/x/crypto@v0.31.0
>  * CVE-2024-45338: HIGH, not affected as we do not use HTML parsing, just 
> upgrade to satisfy the scanner: golang.org/x/net@v0.33.0






[jira] [Resolved] (YUNIKORN-3001) Document new helm chart settings for GOGC / GOMEMLIMIT

2024-12-16 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3001.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Document new helm chart settings for GOGC / GOMEMLIMIT
> --
>
> Key: YUNIKORN-3001
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3001
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Document changes made in YUNIKORN-3000.






[jira] [Resolved] (YUNIKORN-2999) Bump e2e test for core to Kubernetes v1.32.0

2024-12-16 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2999.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Bump e2e test for core to Kubernetes v1.32.0
> 
>
> Key: YUNIKORN-2999
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2999
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Now that Kubernetes v1.32.0 has been released, we should start testing 
> pre-commits on that version.






[jira] [Created] (YUNIKORN-3001) Document new helm chart settings for GOGC / GOMEMLIMIT

2024-12-13 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-3001:
--

 Summary: Document new helm chart settings for GOGC / GOMEMLIMIT
 Key: YUNIKORN-3001
 URL: https://issues.apache.org/jira/browse/YUNIKORN-3001
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: documentation
Reporter: Craig Condit
Assignee: Craig Condit


Document changes made in YUNIKORN-3000.






[jira] [Resolved] (YUNIKORN-3000) Add support for setting GOGC / GOMEMLIMIT in Helm chart

2024-12-13 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-3000.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Add support for setting GOGC / GOMEMLIMIT in Helm chart
> ---
>
> Key: YUNIKORN-3000
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3000
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: release, release-notes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> As of Go 1.19, Go supports reading a {{GOMEMLIMIT}} environment variable to 
> limit the amount of RAM the process will use before garbage collecting. This 
> is a soft limit, but helps constrain resource usage and can help avoid being 
> OOMKilled by Kubernetes. Since we set our existing memory request to 1 GiB, 
> and limit to 2 GiB, I propose setting the default {{GOMEMLIMIT}} value to 1.5 
> GiB. This allows for non-GC overhead plus transient spikes, and should keep 
> us well under the threshold for being OOMKilled. We should also allow 
> overriding {{GOGC}} as well.
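
The arithmetic behind the proposed default can be sketched in Go. This is only an illustration, not the Helm chart change itself: `softLimitBytes` and `gomemlimitValue` are hypothetical helpers, and taking the midpoint of request and limit is an assumption that happens to reproduce the 1.5 GiB figure above. `debug.SetMemoryLimit` is the standard-library counterpart of the `GOMEMLIMIT` environment variable.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

const mib = int64(1) << 20

// softLimitBytes returns the midpoint between a container's memory request
// and its limit (1 GiB and 2 GiB give 1.5 GiB).
// Hypothetical helper for illustration; not part of YuniKorn.
func softLimitBytes(requestBytes, limitBytes int64) int64 {
	return requestBytes + (limitBytes-requestBytes)/2
}

// gomemlimitValue renders a byte count as a whole-MiB GOMEMLIMIT string.
func gomemlimitValue(bytes int64) string {
	return fmt.Sprintf("%dMiB", bytes/mib)
}

func main() {
	limit := softLimitBytes(1024*mib, 2048*mib)
	fmt.Println("GOMEMLIMIT =", gomemlimitValue(limit))

	// The same soft limit can be applied from code; SetMemoryLimit
	// returns the previously configured limit.
	prev := debug.SetMemoryLimit(limit)
	fmt.Println("previous limit (bytes):", prev)

	// A negative input leaves the limit unchanged and reports the current one.
	fmt.Println("current limit (bytes):", debug.SetMemoryLimit(-1))
}
```

Because the limit is soft, the process can still exceed it; the headroom between 1.5 GiB and the 2 GiB container limit is what absorbs non-GC overhead and transient spikes.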






[jira] [Updated] (YUNIKORN-3000) Add support for setting GOGC / GOMEMLIMIT in Helm chart

2024-12-13 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3000:
---
Description: As of Go 1.19, Go supports reading a {{GOMEMLIMIT}} 
environment variable to limit the amount of RAM the process will use before 
garbage collecting. This is a soft limit, but helps constrain resource usage 
and can help avoid being OOMKilled by Kubernetes. Since we set our existing 
memory request to 1 GiB, and limit to 2 GiB, I propose setting the default 
{{GOMEMLIMIT}} value to 1.5 GiB. This allows for non-GC overhead plus transient 
spikes, and should keep us well under the threshold for being OOMKilled. We 
should also allow overriding {{GOGC}} as well.  (was: As of Go 1.19, Go 
supports reading a {{GOMEMLIMIT}} environment variable to limit the amount of 
RAM the process will use before garbage collecting. This is a soft limit, but 
helps constrain resource usage and can help avoid being OOMKilled by 
Kubernetes. Since we set our existing memory request to 1 GiB, and limit to 2 
GiB, I propose setting the default {{GOMEMLIMIT}} value to 1.5 GiB. This allows 
for non-GC overhead plus transient spikes, and should keep us well under the 
threshold for being OOMKilled.)

> Add support for setting GOGC / GOMEMLIMIT in Helm chart
> ---
>
> Key: YUNIKORN-3000
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3000
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: release, release-notes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>
> As of Go 1.19, Go supports reading a {{GOMEMLIMIT}} environment variable to 
> limit the amount of RAM the process will use before garbage collecting. This 
> is a soft limit, but helps constrain resource usage and can help avoid being 
> OOMKilled by Kubernetes. Since we set our existing memory request to 1 GiB, 
> and limit to 2 GiB, I propose setting the default {{GOMEMLIMIT}} value to 1.5 
> GiB. This allows for non-GC overhead plus transient spikes, and should keep 
> us well under the threshold for being OOMKilled. We should also allow 
> overriding {{GOGC}} as well.






[jira] [Updated] (YUNIKORN-3000) Add support for setting GOGC / GOMEMLIMIT in Helm chart

2024-12-13 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-3000:
---
Summary: Add support for setting GOGC / GOMEMLIMIT in Helm chart  (was: Add 
support for setting GOMEMLIMIT in Helm chart)

> Add support for setting GOGC / GOMEMLIMIT in Helm chart
> ---
>
> Key: YUNIKORN-3000
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3000
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: release, release-notes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>
> As of Go 1.19, Go supports reading a {{GOMEMLIMIT}} environment variable to 
> limit the amount of RAM the process will use before garbage collecting. This 
> is a soft limit, but helps constrain resource usage and can help avoid being 
> OOMKilled by Kubernetes. Since we set our existing memory request to 1 GiB, 
> and limit to 2 GiB, I propose setting the default {{GOMEMLIMIT}} value to 1.5 
> GiB. This allows for non-GC overhead plus transient spikes, and should keep 
> us well under the threshold for being OOMKilled.






[jira] [Created] (YUNIKORN-3000) Add support for setting GOMEMLIMIT in Helm chart

2024-12-13 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-3000:
--

 Summary: Add support for setting GOMEMLIMIT in Helm chart
 Key: YUNIKORN-3000
 URL: https://issues.apache.org/jira/browse/YUNIKORN-3000
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: release, release-notes
Reporter: Craig Condit
Assignee: Craig Condit


As of Go 1.19, Go supports reading a {{GOMEMLIMIT}} environment variable to 
limit the amount of RAM the process will use before garbage collecting. This is 
a soft limit, but helps constrain resource usage and can help avoid being 
OOMKilled by Kubernetes. Since we set our existing memory request to 1 GiB, and 
limit to 2 GiB, I propose setting the default {{GOMEMLIMIT}} value to 1.5 GiB. 
This allows for non-GC overhead plus transient spikes, and should keep us well 
under the threshold for being OOMKilled.






[jira] [Resolved] (YUNIKORN-2995) Update e2e test matrix to support v1.32.0

2024-12-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2995.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Update e2e test matrix to support v1.32.0
> -
>
> Key: YUNIKORN-2995
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2995
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Update the e2e test matrix to use the latest kind release, as well as 
> Kubernetes v1.32.0.






[jira] [Created] (YUNIKORN-2999) Bump e2e test for core to Kubernetes v1.32.0

2024-12-12 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-2999:
--

 Summary: Bump e2e test for core to Kubernetes v1.32.0
 Key: YUNIKORN-2999
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2999
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: core - common
Reporter: Craig Condit
Assignee: Craig Condit


Now that Kubernetes v1.32.0 has been released, we should start testing 
pre-commits on that version.






[jira] [Created] (YUNIKORN-2995) Update e2e test matrix to support v1.32.0

2024-12-11 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-2995:
--

 Summary: Update e2e test matrix to support v1.32.0
 Key: YUNIKORN-2995
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2995
 Project: Apache YuniKorn
  Issue Type: Task
  Components: shim - kubernetes
Reporter: Craig Condit
Assignee: Craig Condit


Update the e2e test matrix to use the latest kind release, as well as 
Kubernetes v1.32.0.






[jira] [Updated] (YUNIKORN-2996) Update build dependency to kubernetes v1.32.0

2024-12-11 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2996:
---
Description: Kubernetes v1.32.0 is out, and has several scheduler-related 
changes. We should plan to update our build dependency to match.

> Update build dependency to kubernetes v1.32.0
> -
>
> Key: YUNIKORN-2996
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2996
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>
> Kubernetes v1.32.0 is out, and has several scheduler-related changes. We 
> should plan to update our build dependency to match.






[jira] [Created] (YUNIKORN-2996) Update build dependency to kubernetes v1.32.0

2024-12-11 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-2996:
--

 Summary: Update build dependency to kubernetes v1.32.0
 Key: YUNIKORN-2996
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2996
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: shim - kubernetes
Reporter: Craig Condit
Assignee: Craig Condit









[jira] [Resolved] (YUNIKORN-2992) Emit warning message in plugin mode

2024-12-11 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2992.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Emit warning message in plugin mode
> ---
>
> Key: YUNIKORN-2992
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2992
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: shim - kubernetes
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Based on our discussion on yunikorn-dev mailing list, we have the following 
> timeline for the plugin mode removal:
> - YuniKorn 1.6.0 - Announce the deprecation of the plugin model, but no code 
> changes.
> - YuniKorn 1.7.0 - Emit warnings when the plugin mode is active, but nothing 
> else.
> - YuniKorn 1.8.0 - Stop testing and building the plugin as part of the normal 
> development cycle.
> - YuniKorn 1.9.0 - Remove the implementation entirely.
> The next release is 1.7.0, so the warning message needs to be added.






[jira] [Resolved] (YUNIKORN-2993) YuniKorn UI not showing applications when stateLog not exist

2024-12-05 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2993.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> YuniKorn UI not showing applications when stateLog not exist
> 
>
> Key: YUNIKORN-2993
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2993
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: webapp
>Reporter: Xi Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
> Attachments: image-2024-11-11-12-25-07-312.png
>
>
> We encountered the following issue in the YuniKorn UI during peak hours: there 
> were hundreds of applications in the queue, but the UI showed nothing.
> The reason is that `stateLog` is missing for some applications in the 
> response of the API 
> `/ws/v1/partition/${partitionName}/queue/${queueName}/applications`.
> !image-2024-11-11-12-25-07-312.png|width=1403,height=384!






[jira] [Commented] (YUNIKORN-1728) MaxApplication enforcement supports percentage of resources

2024-11-30 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902066#comment-17902066
 ] 

Craig Condit commented on YUNIKORN-1728:


This absolutely needs a comprehensive design doc before proceeding. 

> MaxApplication enforcement supports percentage of resources
> ---
>
> Key: YUNIKORN-1728
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1728
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: core - scheduler
>Reporter: Rainie Li
>Assignee: Rainie Li
>Priority: Major
>
> Currently we need to configure a queue's guaranteed resources as absolute values.
> Example:
> {code:java}
> queues:
>   - name: root
>     submitacl: '*'
>     queues:
>       - name: queue1
>         submitacl: '*'
>         maxapplications: 12
>         resources:
>           guaranteed:
>             {memory: 6290G, vcore: 816}
>           max:
>             {memory: 31450G, vcore: 4080}
> {code}
> It would be convenient to support percentages, so that we can configure a queue 
> without calculating the actual numbers.
> {code:java}
> queues:
>   - name: root
>     submitacl: '*'
>     queues:
>       - name: queue1
>         submitacl: '*'
>         maxapplications: 12
>         resources:
>           guaranteed:
>             {memory: 20%, vcore: 20%}
>           max:
>             {memory: 31450G, vcore: 4080}
> {code}
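
A minimal Go sketch of how a percentage value might be resolved to an absolute quantity. Everything here is an assumption for illustration: `resolveQuantity` is a hypothetical helper, and the issue does not say what a percentage would be relative to. The sketch uses the queue's `max` value as the reference only because the numbers in the example happen to match (816 is 20% of 4080).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolveQuantity returns value unchanged when it is an absolute integer,
// or the given percentage of reference when it ends in "%".
// Hypothetical helper: the choice of reference quantity is an assumption.
func resolveQuantity(value string, reference int64) (int64, error) {
	if strings.HasSuffix(value, "%") {
		pct, err := strconv.ParseInt(strings.TrimSuffix(value, "%"), 10, 64)
		if err != nil {
			return 0, fmt.Errorf("invalid percentage %q: %w", value, err)
		}
		if pct < 0 || pct > 100 {
			return 0, fmt.Errorf("percentage %q out of range", value)
		}
		return reference * pct / 100, nil
	}
	abs, err := strconv.ParseInt(value, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid quantity %q: %w", value, err)
	}
	return abs, nil
}

func main() {
	// 20% of the max vcore value from the example above (4080).
	vcore, err := resolveQuantity("20%", 4080)
	fmt.Println(vcore, err)
}
```

Unit suffixes such as `G` are deliberately left out of the sketch; a real implementation would need to define how percentages interact with them, which is the kind of question a design document would settle.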






[jira] [Resolved] (YUNIKORN-2974) Expose fencing details in the queue REST info

2024-11-22 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2974.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Expose fencing details in the queue REST info
> -
>
> Key: YUNIKORN-2974
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2974
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> The queue REST info does not contain the fencing information and is missing 
> some other details.
> We need to add the missing pieces.






[jira] [Resolved] (YUNIKORN-2925) Remove internal objects from application REST response

2024-11-22 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2925.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Remove internal objects from application REST response
> --
>
> Key: YUNIKORN-2925
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2925
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available, release-notes
> Fix For: 1.7.0
>
>
> The REST API for application objects exposes an internal object type 
> (resource) directly without conversion. That means any internal 
> representation change will break REST compatibility. This should never have 
> happened and needs to be reversed ASAP. All other REST calls 
> The other problem with the exposed information is that it is only accurate 
> for the COMPLETING or COMPLETED state of an application. The data is 
> incomplete at any other state as it is only updated when an allocation 
> finishes. Running allocations are not included. 






[jira] [Reopened] (YUNIKORN-2977) [Umbrella] DaemonSet preemption hardening

2024-11-20 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit reopened YUNIKORN-2977:


> [Umbrella] DaemonSet preemption hardening
> -
>
> Key: YUNIKORN-2977
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2977
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 1.7.0, 1.6.1
>
>
> We identified a couple of issues with DaemonSet preemption. The current 
> implementation is not stable.
> Notable issues:
> 1. Flooding the logs from {{tryAllocate()}} because unreservation fails
> 2. Flooding the logs if there are no victims found
> 3. Allocation is stuck if the DS pod cannot run on the target node due to 
> predicate errors






[jira] [Resolved] (YUNIKORN-2901) when creating new queues, queue name is used as queue path

2024-11-20 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2901.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> when creating new queues, queue name is used as queue path
> --
>
> Key: YUNIKORN-2901
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2901
> Project: Apache YuniKorn
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.5.0, 1.6.0
>Reporter: Hengzhe Guo
>Assignee: Hengzhe Guo
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 1.7.0
>
>
> At 
> [https://github.com/apache/yunikorn-core/blame/master/pkg/scheduler/objects/queue.go#L121]
>  in NewConfiguredQueue, the new queue's name is used as its path. For non-root 
> queues, the path is later correctly set to the full path at line 137, but several 
> actions between those two points use the bare name as the path, causing issues 
> such as emitting metrics with the wrong label.
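
The ordering problem described above can be illustrated with a small Go sketch. This is not the actual `NewConfiguredQueue` code: the `queue` type and `newQueue` function are hypothetical, and the dot-separated path format is assumed. The point is only that the full path is resolved before anything else can read it.

```go
package main

import "fmt"

// queue is a simplified stand-in for a scheduler queue (hypothetical type).
type queue struct {
	name   string
	path   string
	parent *queue
}

// newQueue resolves the full path up front, so that later steps such as
// metrics or logging never observe the bare name of a non-root queue.
func newQueue(name string, parent *queue) *queue {
	q := &queue{name: name, parent: parent, path: name}
	if parent != nil {
		// Assumed path format: parent path and name joined by a dot.
		q.path = parent.path + "." + name
	}
	return q
}

func main() {
	root := newQueue("root", nil)
	child := newQueue("queue1", root)
	fmt.Println(child.path)
}
```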






[jira] [Resolved] (YUNIKORN-2977) [Umbrella] DaemonSet preemption hardening

2024-11-20 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2977.

Fix Version/s: 1.7.0
   1.6.1
 Assignee: Peter Bacsko
   Resolution: Fixed

Merged to master and cherry-picked to branch-1.6.

> [Umbrella] DaemonSet preemption hardening
> -
>
> Key: YUNIKORN-2977
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2977
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 1.7.0, 1.6.1
>
>
> We identified a couple of issues with DaemonSet preemption. The current 
> implementation is not stable.
> Notable issues:
> 1. Flooding the logs from {{tryAllocate()}} because unreservation fails
> 2. Flooding the logs if there are no victims found
> 3. Allocation is stuck if the DS pod cannot run on the target node due to 
> predicate errors






[jira] [Resolved] (YUNIKORN-2980) DaemonSet preemption: don't flood the logs if victim selection fails

2024-11-20 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2980.

Fix Version/s: 1.7.0
   1.6.1
   Resolution: Fixed

Merged to master and cherry-picked to branch-1.6.

> DaemonSet preemption: don't flood the logs if victim selection fails
> 
>
> Key: YUNIKORN-2980
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2980
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.1
>
>
> If we can't find a proper victim for a DaemonSet pod, we constantly print the 
> following to the console in every scheduling cycle:
> {noformat}
> log.Log(log.SchedApplication).Info("Triggering preemption process for daemon set ask",
>   zap.String("ds allocation key", ask.GetAllocationKey()))
> [...]
> log.Log(log.SchedApplication).Warn("Problem in finding the victims for preempting resources to meet required ask requirements",
>   zap.String("ds allocation key", ask.GetAllocationKey()),
>   zap.String("node id", reserve.nodeID))
> {noformat}
> Since we attempt to schedule every 100 ms, this is logged at least 10 times 
> per second.
> Suggestion: do the preemption silently and only log if it has succeeded, just 
> like regular preemption. Otherwise, use {{Allocation.LogAllocationFailure()}}, 
> which we already do for a variety of things.






[jira] (YUNIKORN-2977) [Umbrella] DaemonSet preemption hardening

2024-11-20 Thread Craig Condit (Jira)


[ https://issues.apache.org/jira/browse/YUNIKORN-2977 ]


Craig Condit deleted comment on YUNIKORN-2977:


was (Author: ccondit):
Merged to master and cherry-picked to branch-1.6.

> [Umbrella] DaemonSet preemption hardening
> -
>
> Key: YUNIKORN-2977
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2977
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>
> We identified a couple of issues with DaemonSet preemption. The current 
> implementation is not stable.
> Notable issues:
> 1. Flooding the logs from {{tryAllocate()}} because unreservation fails
> 2. Flooding the logs if there are no victims found
> 3. Allocation is stuck if the DS pod cannot run on the target node due to 
> predicate errors






[jira] [Updated] (YUNIKORN-2977) [Umbrella] DaemonSet preemption hardening

2024-11-20 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2977:
---
Fix Version/s: (was: 1.7.0)
   (was: 1.6.1)

> [Umbrella] DaemonSet preemption hardening
> -
>
> Key: YUNIKORN-2977
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2977
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>
> We identified a couple of issues with DaemonSet preemption. The current 
> implementation is not stable.
> Notable issues:
> 1. Flooding the logs from {{tryAllocate()}} because unreservation fails
> 2. Flooding the logs if there are no victims found
> 3. Allocation is stuck if the DS pod cannot run on the target node due to 
> predicate errors






[jira] [Resolved] (YUNIKORN-2978) Orphan allocations due to reservation host mismatch

2024-11-19 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2978.

Fix Version/s: 1.7.0
   1.6.1
   Resolution: Fixed

Merged to master and branch-1.6.

> Orphan allocations due to reservation host mismatch
> ---
>
> Key: YUNIKORN-2978
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2978
> Project: Apache YuniKorn
>  Issue Type: Bug
>Affects Versions: 1.6.0
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.1
>
>
> When allocating after a reservation, if the target node doesn't match the 
> reserved node, we record the wrong information in the pod (the reserved node) 
> instead of the target node. This results in orphan allocations as the 
> pod->node relationship doesn't match node->pod.
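
The mismatch can be made concrete with a small Go sketch that cross-checks the two views. The map-based model and the `orphans` function are hypothetical simplifications, not YuniKorn data structures: a pod counts as orphaned here when the node it records does not list it.

```go
package main

import (
	"fmt"
	"sort"
)

// orphans returns the pods whose recorded node does not list them,
// i.e. where pod->node and node->pod disagree (hypothetical helper).
func orphans(podToNode map[string]string, nodeToPods map[string][]string) []string {
	var out []string
	for pod, node := range podToNode {
		found := false
		for _, p := range nodeToPods[node] {
			if p == pod {
				found = true
				break
			}
		}
		if !found {
			out = append(out, pod)
		}
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}

func main() {
	// The allocation landed on node-b (the target node).
	nodeToPods := map[string][]string{"node-b": {"pod-1"}}

	// Recording the reserved node (node-a) on the pod produces an orphan.
	fmt.Println(orphans(map[string]string{"pod-1": "node-a"}, nodeToPods))

	// Recording the target node keeps both views consistent.
	fmt.Println(orphans(map[string]string{"pod-1": "node-b"}, nodeToPods))
}
```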



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org



[jira] [Resolved] (YUNIKORN-2958) Display foreign allocations on the web UI

2024-11-18 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2958.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Display foreign allocations on the web UI
> -
>
> Key: YUNIKORN-2958
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2958
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: webapp
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Now that we track individual non-Yunikorn pods, we should display them on the 
> UI.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org



[jira] [Resolved] (YUNIKORN-2961) Move node utilizations chart to dashboard page

2024-11-18 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2961.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Move node utilizations chart to dashboard page
> --
>
> Key: YUNIKORN-2961
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2961
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: webapp
>Reporter: JunHong Peng
>Assignee: JunHong Peng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Following [~wilfreds]'s advice, adding a node utilization chart to the 
> dashboard page will make it more complete.






[jira] [Resolved] (YUNIKORN-2895) Don't add duplicated allocation to node when the allocation ask fails

2024-11-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2895.

Resolution: Invalid

Resolving this as root cause has been identified and should be fixed by 
YUNIKORN-2978.

> Don't add duplicated allocation to node when the allocation ask fails
> -
>
> Key: YUNIKORN-2895
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2895
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: core - scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
>  Labels: pull-request-available
> Attachments: orphaned_dataops_1.6_patched.json
>
>
> While revisiting the new update allocation logic, I found that a duplicate 
> allocation can be added to a node if the allocation has already been 
> allocated: we try to add the allocation to the node again and don't revert 
> it.






[jira] [Closed] (YUNIKORN-2895) Don't add duplicated allocation to node when the allocation ask fails

2024-11-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit closed YUNIKORN-2895.
--

> Don't add duplicated allocation to node when the allocation ask fails
> -
>
> Key: YUNIKORN-2895
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2895
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: core - scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
>  Labels: pull-request-available
> Attachments: orphaned_dataops_1.6_patched.json
>
>
> While revisiting the new update allocation logic, I found that a duplicate 
> allocation can be added to a node if the allocation has already been 
> allocated: we try to add the allocation to the node again and don't revert 
> it.






[jira] [Updated] (YUNIKORN-2978) Orphan allocations due to reservation host mismatch

2024-11-15 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2978:
---
Priority: Critical  (was: Major)

> Orphan allocations due to reservation host mismatch
> ---
>
> Key: YUNIKORN-2978
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2978
> Project: Apache YuniKorn
>  Issue Type: Bug
>Affects Versions: 1.6.0
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Critical
>
> When allocating after a reservation, if the target node doesn't match the 
> reserved node, we record the wrong information in the pod (the reserved node) 
> instead of the target node. This results in orphan allocations as the 
> pod->node relationship doesn't match node->pod.






[jira] [Commented] (YUNIKORN-2895) Don't add duplicated allocation to node when the allocation ask fails

2024-11-15 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898758#comment-17898758
 ] 

Craig Condit commented on YUNIKORN-2895:


I believe I've identified the root cause of the node mismatch as YUNIKORN-2978. 
Let's continue the discussion there.

> Don't add duplicated allocation to node when the allocation ask fails
> -
>
> Key: YUNIKORN-2895
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2895
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: core - scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
>  Labels: pull-request-available
> Attachments: orphaned_dataops_1.6_patched.json
>
>
> While revisiting the new update allocation logic, I found that a duplicate 
> allocation can be added to a node if the allocation has already been 
> allocated: we try to add the allocation to the node again and don't revert 
> it.






[jira] [Created] (YUNIKORN-2978) Orphan allocations due to reservation host mismatch

2024-11-15 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-2978:
--

 Summary: Orphan allocations due to reservation host mismatch
 Key: YUNIKORN-2978
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2978
 Project: Apache YuniKorn
  Issue Type: Bug
Affects Versions: 1.6.0
Reporter: Craig Condit
Assignee: Craig Condit


When allocating after a reservation, if the target node doesn't match the 
reserved node, we record the wrong information in the pod (the reserved node) 
instead of the target node. This results in orphan allocations as the pod->node 
relationship doesn't match node->pod.
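
To make the mismatch concrete, here is a minimal Go sketch. It is not the YuniKorn code: Allocation, Node, Place and Orphans are hypothetical names invented for this illustration, and the real scheduler tracks far more state. It only shows how recording the reserved node on the pod side, while the allocation actually lands on the target node, leaves the two views disagreeing.

```go
package main

import "fmt"

// Allocation is a hypothetical, simplified stand-in for a scheduler allocation.
type Allocation struct {
	Key    string
	NodeID string // node recorded on the pod side (pod->node)
}

// Node is a hypothetical node that tracks the allocations placed on it (node->pod).
type Node struct {
	ID     string
	Allocs map[string]*Allocation
}

// NewNode returns an empty node with the given ID.
func NewNode(id string) *Node {
	return &Node{ID: id, Allocs: map[string]*Allocation{}}
}

// Place adds the allocation to the target node and writes recordedID back to
// the allocation. Passing the reserved node's ID instead of target.ID
// reproduces the mismatch described in this issue.
func Place(alloc *Allocation, target *Node, recordedID string) {
	target.Allocs[alloc.Key] = alloc
	alloc.NodeID = recordedID
}

// Orphans returns the keys of allocations whose pod-side node ID differs from
// the node that actually holds them.
func Orphans(nodes ...*Node) []string {
	var out []string
	for _, n := range nodes {
		for key, a := range n.Allocs {
			if a.NodeID != n.ID {
				out = append(out, key)
			}
		}
	}
	return out
}

func main() {
	reserved, target := NewNode("node-a"), NewNode("node-b")

	alloc := &Allocation{Key: "pod-1"}
	Place(alloc, target, reserved.ID) // buggy: records the reserved node
	fmt.Println("orphans with reserved node recorded:", len(Orphans(reserved, target)))

	Place(alloc, target, target.ID) // fixed: records the node actually used
	fmt.Println("orphans with target node recorded:", len(Orphans(reserved, target)))
}
```

The consistency check in Orphans is only an aid for the illustration; the point is that the value written back to the pod has to come from the node that was actually used.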






[jira] [Commented] (YUNIKORN-2972) Remove Resource object from user group REST API

2024-11-14 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898388#comment-17898388
 ] 

Craig Condit commented on YUNIKORN-2972:


Merged shim PR as well.

> Remove Resource object from user group REST API
> ---
>
> Key: YUNIKORN-2972
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2972
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Splitting off the removal of the Resource object from the user group REST API 
> from the tracked resources fix.
> The UGM code throughout its test cases relied on the Resource object in the 
> REST wrapper. It extends into the scheduler test code.






[jira] [Resolved] (YUNIKORN-2967) Cleanup REST response headers

2024-11-14 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2967.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Cleanup REST response headers
> -
>
> Key: YUNIKORN-2967
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2967
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> The REST responses set a standard header set on all responses.
> The [RFC|https://datatracker.ietf.org/doc/html/rfc7480#section-5.6] says for 
> CORS headers:
> {code:java}
> Use of the Access-Control-Allow-Credentials header field is NOT 
> RECOMMENDED.{code}
> We set that header to TRUE; we should not do that.
> Every response lists all methods in the Access-Control-Allow-Methods header. 
> That is not correct: we do not support HEAD, and only one endpoint supports 
> POST. We should not set all of these methods, just GET or POST alongside 
> OPTIONS, whichever the endpoint supports.
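
A rough sketch of what per-endpoint CORS headers could look like in Go follows. This is an illustration under assumptions, not the actual YuniKorn webservice code: allowedMethods and withCORS are hypothetical helpers, the endpoint path is made up, and the wildcard Access-Control-Allow-Origin value is an assumption as well. The idea is that each handler advertises only the method it supports plus OPTIONS, and Access-Control-Allow-Credentials is never set.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// allowedMethods builds the Access-Control-Allow-Methods value for a single
// endpoint: the methods it really supports plus OPTIONS for preflight requests.
func allowedMethods(supported ...string) string {
	methods := append([]string{}, supported...) // copy so the caller's slice is untouched
	methods = append(methods, http.MethodOptions)
	return strings.Join(methods, ", ")
}

// withCORS (hypothetical helper) wraps a handler and sets only the CORS
// headers the endpoint needs. Access-Control-Allow-Credentials is
// deliberately left unset.
func withCORS(next http.HandlerFunc, supported ...string) http.HandlerFunc {
	allow := allowedMethods(supported...)
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Access-Control-Allow-Origin", "*") // assumption for this sketch
		w.Header().Set("Access-Control-Allow-Methods", allow)
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent) // answer the preflight, skip the handler
			return
		}
		next(w, r)
	}
}

func main() {
	h := withCORS(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}, http.MethodGet)

	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/hypothetical/endpoint", nil))

	fmt.Println("Allow-Methods:", rec.Header().Get("Access-Control-Allow-Methods"))
	fmt.Println("Allow-Credentials set:", rec.Header().Get("Access-Control-Allow-Credentials") != "")
}
```

An endpoint that accepts POST would be wrapped with http.MethodPost instead of http.MethodGet, so the advertised list stays in step with what the handler really accepts.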






[jira] [Resolved] (YUNIKORN-2972) Remove Resource object from user group REST API

2024-11-13 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2972.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Remove Resource object from user group REST API
> ---
>
> Key: YUNIKORN-2972
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2972
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Splitting off the removal of the Resource object from the user group REST API 
> from the tracked resources fix.
> The UGM code throughout its test cases relied on the Resource object in the 
> REST wrapper. It extends into the scheduler test code.






[jira] [Updated] (YUNIKORN-2972) Remove Resource object from user group REST API

2024-11-13 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2972:
---
Summary: Remove Resource object from user group REST API  (was: Remove 
Resource object from user group REST)

> Remove Resource object from user group REST API
> ---
>
> Key: YUNIKORN-2972
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2972
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
>
> Splitting off the removal of the Resource object from the user group REST API 
> from the tracked resources fix.
> The UGM code throughout its test cases relied on the Resource object in the 
> REST wrapper. It extends into the scheduler test code.






[jira] [Updated] (YUNIKORN-2972) Remove Resource object from user group REST

2024-11-13 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2972:
---
Summary: Remove Resource object from user group REST  (was: remove Resource 
object from user group REST)

> Remove Resource object from user group REST
> ---
>
> Key: YUNIKORN-2972
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2972
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
>
> Splitting off the removal of the Resource object from the user group REST API 
> from the tracked resources fix.
> The UGM code throughout its test cases relied on the Resource object in the 
> REST wrapper. It extends into the scheduler test code.






[jira] [Resolved] (YUNIKORN-2957) Improve visual in dashboard page

2024-11-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2957.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Improve visual in dashboard page
> 
>
> Key: YUNIKORN-2957
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2957
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: webapp
>Reporter: JunHong Peng
>Assignee: JunHong Peng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> - Ensure consistent coloring for app statuses across multiple charts
> (helps users easily recognize and distinguish statuses without confusion)
> - Design new style of chart for clarity






[jira] [Resolved] (YUNIKORN-2971) Use FakeRecorder and MockedRecorder properly in the tests

2024-11-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2971.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Use FakeRecorder and MockedRecorder properly in the tests
> -
>
> Key: YUNIKORN-2971
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2971
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes, test - unit
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Certain test cases in the shim verify the type of the event recorder:
> {noformat}
>   recorder, ok := events.GetRecorder().(*k8sEvents.FakeRecorder)
>   if !ok {
>       t.Fatal("the EventRecorder is expected to be of type FakeRecorder")
>   }
> {noformat}
> However, this just happens to pass by accident because a previous test 
> modified it from MockedRecorder to FakeRecorder. Running such a test on its 
> own fails.
> Another potential issue is that tests don't restore the recorder in a 
> deferred section. Most of the time the restored type is a FakeRecorder which 
> is not correct.
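
One way to make such tests independent of execution order is sketched below. The types are simplified stand-ins written for this illustration, not the shim's events package: EventRecorder, MockedRecorder, FakeRecorder, GetRecorder and SetRecorder only mirror the names used above, and useRecorder is a hypothetical helper. Each test installs the recorder it needs and restores the previous one in a deferred call.

```go
package main

import "fmt"

// EventRecorder and the two implementations are simplified, hypothetical
// stand-ins for the recorder types mentioned in this issue.
type EventRecorder interface{ Name() string }

type MockedRecorder struct{}

func (MockedRecorder) Name() string { return "mocked" }

type FakeRecorder struct{}

func (FakeRecorder) Name() string { return "fake" }

// recorder is the package-level default, assumed here to be a MockedRecorder.
var recorder EventRecorder = MockedRecorder{}

func GetRecorder() EventRecorder  { return recorder }
func SetRecorder(r EventRecorder) { recorder = r }

// useRecorder installs r and returns a function that restores whatever
// recorder was active before, so it can be used directly with defer.
func useRecorder(r EventRecorder) (restore func()) {
	prev := GetRecorder()
	SetRecorder(r)
	return func() { SetRecorder(prev) }
}

// testCase mimics a test that needs a FakeRecorder. It sets the recorder
// itself instead of relying on an earlier test having done so.
func testCase() string {
	defer useRecorder(FakeRecorder{})()

	rec, ok := GetRecorder().(FakeRecorder)
	if !ok {
		return "wrong recorder type"
	}
	return rec.Name()
}

func main() {
	fmt.Println("inside test:", testCase())
	fmt.Println("after test:", GetRecorder().Name())
}
```

Because the restore function captures the previous value rather than hard-coding a type, the recorder ends up as whatever it was before the test ran.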






[jira] [Resolved] (YUNIKORN-2944) Update the k8shim-task-state image

2024-11-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2944.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Update the k8shim-task-state image
> --
>
> Key: YUNIKORN-2944
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2944
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: website
>Reporter: wangzhihui
>Assignee: wangzhihui
>Priority: Minor
> Fix For: 1.7.0
>
>
> The TaskAllocated state shown in the k8shim-task-state image doesn't exist 
> in versions 1.1 to 1.6.
> [1.1 ~ 1.4] the k8shim-task-state is the same
> [1.5 ~ 1.6] the k8shim-task-state is the same






[jira] [Resolved] (YUNIKORN-2941) Remove plugin mode from the install section of Getting Started

2024-11-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2941.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Remove plugin mode from the install section of Getting Started
> --
>
> Key: YUNIKORN-2941
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2941
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: website
>Reporter: Michael Chu
>Assignee: Michael Chu
>Priority: Minor
>  Labels: newbie, pull-request-available
> Fix For: 1.7.0
>
>
> Since plugin mode is now deprecated and will be removed in a future release, 
> it would be better to remove the plugin mode in the install section to 
> prevent any confusion.






[jira] [Resolved] (YUNIKORN-2965) Move statedump to debug endpoint

2024-11-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2965.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged both PRs to master.

> Move statedump to debug endpoint
> 
>
> Key: YUNIKORN-2965
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2965
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
>  Labels: pull-request-available, release-notes
> Fix For: 1.7.0
>
>
> The statedump was added as a debug tool. The content of the statedump is not 
> fixed and can keep changing from release to release.
> Moving it to the debug endpoint to clearly show that it is not a stable 
> object that can be relied on or should be used to integrate with.
> The current endpoint will need to redirect with a 301 Moved Permanently for a 
> couple of releases.
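
The redirect described above could look roughly like this in Go. The two paths are placeholders chosen for the sketch, not necessarily the real YuniKorn endpoints, and newMux and probe are hypothetical helpers. It relies only on the standard library's http.Redirect with http.StatusMovedPermanently.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

const (
	oldPath = "/ws/v1/statedump" // hypothetical legacy location
	newPath = "/debug/statedump" // hypothetical debug location
)

// newMux serves the state dump under the debug path and keeps the old path
// alive as a permanent redirect.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc(newPath, func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "{}") // placeholder body; the real dump is not modelled here
	})
	mux.HandleFunc(oldPath, func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, newPath, http.StatusMovedPermanently)
	})
	return mux
}

// probe issues a GET against the handler and returns the status code and
// the Location header of the response.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	code, loc := probe(newMux(), oldPath)
	fmt.Println("old path:", code, loc)

	code, _ = probe(newMux(), newPath)
	fmt.Println("new path:", code)
}
```

Once the transition period of a couple of releases is over, the old-path handler can be dropped without touching the debug handler.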






[jira] [Resolved] (YUNIKORN-2970) Don't display node attributes by default on the UI

2024-11-12 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2970.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Don't display node attributes by default on the UI
> --
>
> Key: YUNIKORN-2970
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2970
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: webapp
>Reporter: Peter Bacsko
>Assignee: JunHong Peng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> On the nodes view, we display the "node attributes" by default. With a large 
> cluster, this looks really ugly and 99% of the information displayed is not 
> really useful. 
> In fact, if we click on "..." to have an expanded view and click on "..." 
> again, it disappears. This is how it should be displayed by default.






[jira] [Resolved] (YUNIKORN-2927) Update MockScheduler test case with foreign pod resource update

2024-11-08 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2927.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Update MockScheduler test case with foreign pod resource update
> ---
>
> Key: YUNIKORN-2927
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2927
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>







[jira] [Resolved] (YUNIKORN-2966) Not all tags are created for foreign allocations

2024-11-08 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2966.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Not all tags are created for foreign allocations
> 
>
> Key: YUNIKORN-2966
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2966
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> When we create an Allocation request to the core, we don't populate 
> allocation tags properly in {{CreateAllocationForForeignPod()}}. We miss 
> the call to {{CreateTagsForTask()}} which adds a number of useful tags such 
> as namespace, pod name, labels, etc.
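
As an illustration only, the missing step amounts to merging pod-derived tags into the tags of the foreign allocation. The Go sketch below uses invented names throughout: Pod, tagsForPod, foreignAllocationTags and every tag key are hypothetical and are not the shim's actual functions or constants.

```go
package main

import "fmt"

// Pod is a hypothetical, trimmed-down pod description.
type Pod struct {
	Namespace string
	Name      string
	Labels    map[string]string
}

// tagsForPod (hypothetical) derives descriptive tags from the pod, in the
// spirit of the tag helper mentioned above. The key names are made up.
func tagsForPod(p Pod) map[string]string {
	tags := map[string]string{
		"example/namespace": p.Namespace,
		"example/pod-name":  p.Name,
	}
	for k, v := range p.Labels {
		tags["example/label/"+k] = v
	}
	return tags
}

// foreignAllocationTags (hypothetical) starts from the tag that marks the
// allocation as foreign and then merges in the pod-derived tags, which is
// the step the issue reports as missing.
func foreignAllocationTags(p Pod) map[string]string {
	tags := map[string]string{"example/foreign": "true"}
	for k, v := range tagsForPod(p) {
		tags[k] = v
	}
	return tags
}

func main() {
	pod := Pod{
		Namespace: "default",
		Name:      "web-0",
		Labels:    map[string]string{"app": "web"},
	}
	tags := foreignAllocationTags(pod)
	fmt.Println("number of tags:", len(tags))
	fmt.Println("namespace tag:", tags["example/namespace"])
}
```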






[jira] [Closed] (YUNIKORN-2962) Governance clarification: guidance requested on extending Yunikorn core functionality

2024-11-06 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit closed YUNIKORN-2962.
--
Resolution: Information Provided

> Governance clarification: guidance requested on extending Yunikorn core 
> functionality
> -
>
> Key: YUNIKORN-2962
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2962
> Project: Apache YuniKorn
>  Issue Type: Task
>Reporter: David Gantenbein
>Priority: Major
>
> Hey,
>  
> If you’re not aware, G-Research Open Source Software (GR-OSS), has been 
> working in and around the YuniKorn ecosystem for the last several months. 
> We’ve actively contributed a number of enhancements to the project, along 
> with several features related to our need for a persistent record of YuniKorn 
> events merged[0][1][2]. 
>  
> However, recently many of our contributions to Apache YuniKorn appear to have 
> been reverted unilaterally with minimal explanation. It’s unclear to us where 
> the open discussion about this removal, as required by the ASF Code of 
> Conduct, occurred for this – we’re interested and would like to participate 
> in those technical discussions in the future. We’ve tried to glean the 
> primary points of contention here:
>  
> First, our choice of name for the (formerly) YuniKorn-history-server project 
> was unwise given YuniKorn is a trademark of the Apache Software Foundation 
> (ASF). We’ve rectified this issue by renaming the project to 
> unicorn-history-server. The original name was driven by our hope that 
> unicorn-history-server may one day find its home as an official part of 
> YuniKorn. We hope our swift resolution of this concern is evidence of our 
> commitment to the same open philosophies held by the ASF.
>  
> Next, Craig Condit stated a concern[3] that our changes were geared towards 
> permitting proprietary extensions to YuniKorn. GR-OSS is an open source 
> policy office that does not write any proprietary code as part of our 
> mission. In fact, the unicorn-history-server is Apache 2.0 licensed, just 
> like YuniKorn. Our team made extensive efforts to devise the most minimally 
> intrusive changes possible after it was suggested to us that it was better to 
> be out of tree – we’d be thrilled if the solution to this problem would be 
> the unicorn-history-server being adopted as part of yunikorn-core; in lieu of 
> this, keeping a plugin mechanism is a base-level requirement for the 
> unicorn-history-server to function. We hope that our reputation as good 
> upstream citizens and operators can help you understand that we have no 
> hidden agendas – we aren’t even a product company.
>  
> Finally, it was suggested[4] that the getApplication API endpoint was 
> inappropriate due to its exposure of internal YuniKorn data structures. We’re 
> open to feedback regarding this feature and how to improve it, but again, 
> we’re confused as to where these discussions are happening and how to get 
> involved in them. In the original proposal of this feature[5], we added tests 
> and modified the implementation at the request of project maintainers – it’s 
> upsetting to have all that work and cooperation discarded without even a 
> conversation.
>  
> Our desire is to remain a part of the YuniKorn community, but we’re very 
> confused about the governance and technical design process for the project – 
> according to [https://yunikorn.apache.org/community/people/], the maintainers 
> reverting our patches are at the same leadership level as those who approved 
> the patches originally. Can we get some clarity on the reasoning for 
> reverting the patches and documentation of the open community collaboration, 
> as required by the ASF Code of Conduct, that precipitated this removal – this 
> appeared to have been mentioned in the October 30th meeting[6], but this date 
> is after the revert of the patches, so we assume there must have been another 
> discussion.
>  
> If there’s anything further we need to do in order to spawn the technical 
> dialogue needed to address your concerns with unicorn-history-server and the 
> supporting elements, please let us know. Our desire is to implement something 
> in an open way that meets not only the needs of G-Research, but also those of 
> the overall YuniKorn community.
>  
> Thanks for your time,
> Rich Scott, Open Source Developer
> Denis Coric, Open Source Developer
> Jay Faulkner, Open Source Developer
> Dave Gantenbein, Director of Software Development
> Alexander Scammon, Head of Open Source Development
> G-Research Open Source Software
>  
> 0: https://issues.apache.org/jira/browse/YUNIKORN-2606
> 1: https://issues.apache.org/jira/browse/YUNIKORN-2652  
> 2: [http://tiny.cc/ag5tzz]
> 3: https://issues.apache.org/jira/browse/YUNIKORN-29

[jira] [Commented] (YUNIKORN-2962) Governance clarification: guidance requested on extending Yunikorn core functionality

2024-11-06 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896047#comment-17896047
 ] 

Craig Condit commented on YUNIKORN-2962:


With regard to the ASF Code of Conduct, no violations (alleged or otherwise) 
have occurred. The ASF Code of Conduct says nothing whatsoever about "requiring 
open discussion" about code reverts. That said, in each case that you have made 
this assertion, we have, in fact, operated by your own standard (i.e. "open 
discussion"). The open discussion happened on the JIRA and GitHub PRs for the 
relevant issues under discussion here.

A point of correction: there have not been "several contributions reverted". 
There has been only *one* previously-committed PR that has been 
subsequently reverted:  YUNIKORN-2606 (Modular sidebar with remote components), 
which was reverted by YUNIKORN-2954. This JIRA was not "unilaterally reverted" 
– in fact, YUNIKORN-2954 was submitted by a PMC member (myself) along with 
relevant documentation as to why, and approved by [~pbacsko], another PMC 
member. In between, YUNIKORN-2949 (Load external Scheduler Service using Module 
Federation), which you failed to mention here, was submitted, building upon 
YUNIKORN-2606 and if committed, would have wholesale replaced huge portions of 
the YuniKorn Web UI without any user-visible indication that what was being 
displayed on screen was not, in fact, a part of Apache YuniKorn. This was 
rejected, but its submission triggered further review of YUNIKORN-2606 (the 
implications of which had not yet been fully understood) and it became apparent 
that this was not a direction we wanted to pursue. I opened YUNIKORN-2954 and 
provided my justification {*}in the JIRA description and pull request{*}. 
Nothing was done arbitrarily or in secret as you allege.

Additionally, the assertion that the "patches" (only one in fact) were reverted 
by maintainers at the "same leadership level as those who approved the 
patch[es] originally" is false – YUNIKORN-2606 was approved by a single 
committer, and the reversion was submitted by a PMC member and approved by a 
second PMC member.

The only other potential revert that is on the table is YUNIKORN-2925 (Remove 
internal objects from application REST endpoint), which was created by 
[~wilfreds], another PMC member. I agree with this revert as well; we don't 
want to have internal objects in the REST API, nor historical information. That 
is what we have built the YuniKorn event system for. [~pbacsko], who approved 
the original PR, has also commented on the reversion with additional REST API 
endpoints that should probably be cleaned up as well. This would seem to 
indicate that he too has had a change of heart regarding the wisdom of keeping 
the original PR intact. The fact is, we don't revert commits arbitrarily or 
frequently, but sometimes things slip through and we need to course-correct.

Now for some less-technical points...

Project naming of G-Research History Server: Simply changing the spelling of 
"yunikorn" to "unicorn" is not sufficient differentiation under U.S. Trademark 
Law, as this would almost certainly run afoul of the [confusingly 
similar|https://www.law.cornell.edu/wex/confusingly_similar] test. Some 
possible suggestions: Use your company name (i.e. G-Research History Server), a 
generic identifier (Scheduling History Server) or pick a distinct project name, 
i.e. Monocerus, another mythical creature related to the unicorn. Trademark law 
also allows you to reference trademarked entities in your documentation. For 
example, this would be okay: "G-Research History Server is a history service 
designed to integrate with the Apache YuniKorn scheduler for Kubernetes". This 
makes clear that your project is independent, while also providing clarity as 
to its purpose. Ultimately, it's your project – name it whatever you want 
(while respecting trademark law).

Regarding the comment that "it was suggested to us that it was better [for the 
history server] to be out of tree", this was discussed during the initial 
proposal by G-Research of the history server during the May 1, 2024 YuniKorn 
community meeting. As I recall, there were significant concerns raised about 
the validity of augmenting the REST API with history information. We were by 
this point well underway with designs and development of the (now mature) event 
system for YuniKorn, where real-time events would be emitted to an external 
consumer. A very large motivator for that design was to ensure that a future 
history server (yes, you're not the first to think of one) would be able to 
scale well and not bog down YuniKorn itself with non-scheduler overhead. The 
G-Research approach was very much at odds with that (already agreed upon) 
direction. When these concerns were raised, it was suggested that perhaps the 
G-Research history server would be better d

[jira] [Commented] (YUNIKORN-2925) Remove internal objects from application REST response

2024-11-04 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895394#comment-17895394
 ] 

Craig Condit commented on YUNIKORN-2925:


Historical information has no place in the REST API at all. That's what the 
event system is for.

> Remove internal objects from application REST response
> --
>
> Key: YUNIKORN-2925
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2925
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: release-notes
>
> The REST API for application objects exposes an internal object type 
> (resource) directly without conversion. That means any internal 
> representation change will break REST compatibility. This should never have 
> happened and needs to be reversed ASAP. All other REST calls 
> The other problem with the exposed information is that it is only accurate 
> for the COMPLETING or COMPLETED state of an application. The data is 
> incomplete at any other state as it is only updated when an allocation 
> finishes. Running allocations are not included. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org



[jira] [Resolved] (YUNIKORN-2953) Placeholder release count incorrect

2024-11-04 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2953.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Placeholder release count incorrect
> ---
>
> Key: YUNIKORN-2953
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2953
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: core - scheduler
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Even after YUNIKORN-2926 we have not fully fixed the placeholder release 
> count issue. 
> The release of allocated placeholders is counted twice on timeout: first on 
> release, as part of the cleanup that is triggered, and then again when the 
> allocation is actually removed.
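
The double count described above can be illustrated with a small, language-agnostic sketch. YuniKorn itself is written in Go; the Python below is not the project's code, and `PlaceholderTracker`, `record_release`, and the allocation key `"ph-1"` are hypothetical names used only for illustration. It assumes each placeholder allocation has a unique key, and shows one way to make the release counter idempotent so that the timeout cleanup and the later removal are counted once.

```python
class PlaceholderTracker:
    """Hypothetical counter for released placeholders (illustration only)."""

    def __init__(self):
        self.released = 0
        # Keys of allocations whose release has already been counted.
        self._counted = set()

    def record_release(self, allocation_key):
        """Count a release once per allocation; return True if it was counted."""
        if allocation_key in self._counted:
            # Second notification for the same allocation: ignore it.
            return False
        self._counted.add(allocation_key)
        self.released += 1
        return True


tracker = PlaceholderTracker()
tracker.record_release("ph-1")  # triggered by the timeout cleanup
tracker.record_release("ph-1")  # triggered when the allocation is removed
```

Whether the actual fix deduplicates by key or simply drops one of the two call sites is not stated in the issue; this sketch only shows the first option.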






[jira] [Resolved] (YUNIKORN-2956) Fix layout break on Queues v2 page

2024-10-30 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2956.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Fix layout break on Queues v2 page
> --
>
> Key: YUNIKORN-2956
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2956
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: webapp
>Reporter: JunHong Peng
>Assignee: JunHong Peng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>







[jira] [Resolved] (YUNIKORN-2928) [core] Update foreign pod resource usage

2024-10-30 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2928.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> [core] Update foreign pod resource usage
> 
>
> Key: YUNIKORN-2928
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2928
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>







[jira] [Resolved] (YUNIKORN-2951) Remove unnecessary locking from RequiredNodePreemptor

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2951.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Remove unnecessary locking from RequiredNodePreemptor
> -
>
> Key: YUNIKORN-2951
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2951
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - scheduler
>Reporter: Manikandan R
>Assignee: Hsien-Cheng(Ryan) Huang
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 1.7.0
>
>
> RequiredNodePreemptor takes a lock before some reads and writes. Based on 
> the assessment, there is no reason to use locks here, and they should be 
> removed.






[jira] [Reopened] (YUNIKORN-2606) Modular sidebar with remote components

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit reopened YUNIKORN-2606:


> Modular sidebar with remote components
> --
>
> Key: YUNIKORN-2606
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2606
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: webapp
>Reporter: Denis Coric
>Assignee: Denis Coric
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
> Attachments: image-2024-05-07-18-25-08-070.png
>
>
> -We need a link to the external application that will display logs and more 
> details about the application or the pod itself.- 
> -External URLs can be defined in the form of a string template that can be 
> set as an env variable.-
> -If the variable is present on build time, the Logs link will be visible on 
> the UI.-
> To minimize changes in YuniKorn itself and enable maximal customization 
> and easy connection with the YuniKorn History Server (YHS) that is being 
> developed, the easiest solution would be to add an externally loaded 
> component by using module federation. Components will be served by the YHS server 
> (changes on YHS endpoints would reflect in web components as well) and loaded 
> in YuniKorn web with Module Federation. 
> This ticket should add the required configuration for loading a custom module 
> that will be enabled through the env variables. If env is not set, YuniKorn 
> will work as usual (no changes to the default behavior)
> !image-2024-05-07-18-25-08-070.png|width=1240,height=647!






[jira] [Updated] (YUNIKORN-2951) Remove unnecessary locking from RequiredNodePreemptor

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2951:
---
Summary: Remove unnecessary locking from RequiredNodePreemptor  (was: 
RequiredNodePreemptor doesn't require lock)

> Remove unnecessary locking from RequiredNodePreemptor
> -
>
> Key: YUNIKORN-2951
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2951
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - scheduler
>Reporter: Manikandan R
>Assignee: Hsien-Cheng(Ryan) Huang
>Priority: Major
>  Labels: newbie, pull-request-available
>
> RequiredNodePreemptor takes a lock before some reads and writes. Based on 
> the assessment, there is no reason to use locks here, and they should be 
> removed.






[jira] [Resolved] (YUNIKORN-2609) Improve visual style of the Web UI

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2609.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

> Improve visual style of the Web UI
> --
>
> Key: YUNIKORN-2609
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2609
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: webapp
>Reporter: Denis Coric
>Assignee: JunHong Peng
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 1.7.0
>
>
> Implement required CSS changes to tweak the overall look and feel of the web 
> UI.
> The full design can be previewed on this link: [ 
> [DESIGN|https://xd.adobe.com/view/1d84899f-72a8-472f-b03f-de40451b0956-48d7/] 
> ]
> This should include:
>  * Fix padding/margin values
>  * Add rounding on elements to match the design (menu selection, dropdowns, 
> etc)
>  * Fix font weight on visual elements to match the design
> _Note: Queues page can be skipped as it is being redesigned in YUNIKORN-2341_






[jira] [Closed] (YUNIKORN-2606) Modular sidebar with remote components

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit closed YUNIKORN-2606.
--
 Fix Version/s: (was: 1.7.0)
Target Version:   (was: 1.7.0)
Resolution: Won't Do

Removed in YUNIKORN-2954.

> Modular sidebar with remote components
> --
>
> Key: YUNIKORN-2606
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2606
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: webapp
>Reporter: Denis Coric
>Assignee: Denis Coric
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-05-07-18-25-08-070.png
>
>
> -We need a link to the external application that will display logs and more 
> details about the application or the pod itself.- 
> -External URLs can be defined in the form of a string template that can be 
> set as an env variable.-
> -If the variable is present on build time, the Logs link will be visible on 
> the UI.-
> To minimize changes in YuniKorn itself and enable maximal customization 
> and easy connection with the YuniKorn History Server (YHS) that is being 
> developed, the easiest solution would be to add an externally loaded 
> component by using module federation. Components will be served by the YHS server 
> (changes on YHS endpoints would reflect in web components as well) and loaded 
> in YuniKorn web with Module Federation. 
> This ticket should add the required configuration for loading a custom module 
> that will be enabled through the env variables. If env is not set, YuniKorn 
> will work as usual (no changes to the default behavior)
> !image-2024-05-07-18-25-08-070.png|width=1240,height=647!






[jira] [Updated] (YUNIKORN-2954) Remove so-called modular sidebar

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2954:
---
Target Version:   (was: 1.7.0)

> Remove so-called modular sidebar
> 
>
> Key: YUNIKORN-2954
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2954
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: webapp
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
>
> We need to revert YUNIKORN-2606, as it should never have been merged. It has 
> become clear that it exists only to provide invasive hooks for adding 
> proprietary and/or non-standard components to YuniKorn. It also opens up 
> YuniKorn to potential remote code execution vulnerabilities. This goes 
> against our open development philosophy. 






[jira] [Closed] (YUNIKORN-2954) Remove so-called modular sidebar

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit closed YUNIKORN-2954.
--

> Remove so-called modular sidebar
> 
>
> Key: YUNIKORN-2954
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2954
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: webapp
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
>
> We need to revert YUNIKORN-2606, as it should never have been merged. It has 
> become clear that it exists only to provide invasive hooks for adding 
> proprietary and/or non-standard components to YuniKorn. It also opens up 
> YuniKorn to potential remote code execution vulnerabilities. This goes 
> against our open development philosophy. 






[jira] [Updated] (YUNIKORN-2954) Remove so-called modular sidebar

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2954:
---
Fix Version/s: (was: 1.7.0)

> Remove so-called modular sidebar
> 
>
> Key: YUNIKORN-2954
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2954
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: webapp
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
>
> We need to revert YUNIKORN-2606, as it should never have been merged. It has 
> become clear that it exists only to provide invasive hooks for adding 
> proprietary and/or non-standard components to YuniKorn. It also opens up 
> YuniKorn to potential remote code execution vulnerabilities. This goes 
> against our open development philosophy. 






[jira] [Resolved] (YUNIKORN-2954) Remove so-called modular sidebar

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2954.

Fix Version/s: 1.7.0
   Resolution: Fixed

Merged to master.

> Remove so-called modular sidebar
> 
>
> Key: YUNIKORN-2954
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2954
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: webapp
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> We need to revert YUNIKORN-2606, as it should never have been merged. It has 
> become clear that it exists only to provide invasive hooks for adding 
> proprietary and/or non-standard components to YuniKorn. It also opens up 
> YuniKorn to potential remote code execution vulnerabilities. This goes 
> against our open development philosophy. 






[jira] [Created] (YUNIKORN-2954) Remove so-called modular sidebar

2024-10-29 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-2954:
--

 Summary: Remove so-called modular sidebar
 Key: YUNIKORN-2954
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2954
 Project: Apache YuniKorn
  Issue Type: Task
  Components: webapp
Reporter: Craig Condit
Assignee: Craig Condit


We need to revert YUNIKORN-2606, as it should never have been merged. It has 
become clear that it exists only to provide invasive hooks for adding 
proprietary and/or non-standard components to YuniKorn. It also opens up 
YuniKorn to potential remote code execution vulnerabilities. This goes against 
our open development philosophy. 






[jira] [Resolved] (YUNIKORN-2908) Remove associated metrics when queue is removed

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2908.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Remove associated metrics when queue is removed
> ---
>
> Key: YUNIKORN-2908
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2908
> Project: Apache YuniKorn
>  Issue Type: Bug
>Reporter: Hengzhe Guo
>Assignee: Hengzhe Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> 1. After a queue is removed, its metrics continue to be reported by 
> Prometheus. This is fine for metrics like allocated resources because they 
> will just be 0, but it does not make sense for guaranteed and max resources, 
> as it gives the wrong impression that resources are still assigned to the 
> queue. I propose to unregister all of the queue's metrics when it is removed.
> 2. If the queue is not removed but its guaranteed or max resource config is 
> removed, or just a resource type is removed from the config, the metrics are 
> also not cleaned up. These metrics are only updated when there is a new valid 
> value, not when the value is 'null'. I propose to always delete all existing 
> guaranteed and max resource metrics of the queue and then add back the new 
> values every time we apply the configs.
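
The two proposals in this description (unregister everything when a queue is removed, and delete-then-re-add the guaranteed and max gauges on every config apply) can be sketched as follows. This is a minimal illustration in Python, not YuniKorn's Go implementation and not the Prometheus client API: `QueueMetrics`, `apply_config`, `remove_queue`, and the queue name `root.a` are all hypothetical, and a plain dictionary stands in for the metrics registry.

```python
class QueueMetrics:
    """Toy stand-in for a per-queue gauge registry (hypothetical names)."""

    def __init__(self):
        # Maps (queue, kind, resource) to a value; kind is for example
        # "guaranteed", "max" or "allocated".
        self.gauges = {}

    def apply_config(self, queue, guaranteed=None, maximum=None):
        # Delete every existing guaranteed/max gauge of the queue first, so
        # that a removed limit or resource type does not keep a stale value.
        self._delete(queue, kinds=("guaranteed", "max"))
        for kind, resources in (("guaranteed", guaranteed), ("max", maximum)):
            for resource, value in (resources or {}).items():
                self.gauges[(queue, kind, resource)] = value

    def remove_queue(self, queue):
        # Unregister all gauges of the queue, whatever their kind.
        self._delete(queue, kinds=None)

    def _delete(self, queue, kinds):
        stale = [key for key in self.gauges
                 if key[0] == queue and (kinds is None or key[1] in kinds)]
        for key in stale:
            del self.gauges[key]


metrics = QueueMetrics()
metrics.apply_config("root.a", guaranteed={"memory": 100, "vcore": 10},
                     maximum={"memory": 200})
# The vcore guarantee and the max limit are dropped from the config:
metrics.apply_config("root.a", guaranteed={"memory": 50})
```

Deleting before re-adding trades a little extra work per config apply for not having to diff the old and new configs, which is the trade-off the description proposes.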






[jira] [Updated] (YUNIKORN-2908) Remove associated metrics when queue is removed

2024-10-29 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2908:
---
Summary: Remove associated metrics when queue is removed  (was: metrics not 
removed when a queue is removed)

> Remove associated metrics when queue is removed
> ---
>
> Key: YUNIKORN-2908
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2908
> Project: Apache YuniKorn
>  Issue Type: Bug
>Reporter: Hengzhe Guo
>Assignee: Hengzhe Guo
>Priority: Major
>  Labels: pull-request-available
>
> 1. After a queue is removed, its metrics continue to be reported by 
> Prometheus. This is fine for metrics like allocated resources because they 
> will just be 0, but it does not make sense for guaranteed and max resources, 
> as it gives the wrong impression that resources are still assigned to the 
> queue. I propose to unregister all of the queue's metrics when it is removed.
> 2. If the queue is not removed but its guaranteed or max resource config is 
> removed, or just a resource type is removed from the config, the metrics are 
> also not cleaned up. These metrics are only updated when there is a new valid 
> value, not when the value is 'null'. I propose to always delete all existing 
> guaranteed and max resource metrics of the queue and then add back the new 
> values every time we apply the configs.






[jira] [Resolved] (YUNIKORN-2948) Add MockScheduler test which verifies foreign pod tracking

2024-10-28 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2948.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Add MockScheduler test which verifies foreign pod tracking
> --
>
> Key: YUNIKORN-2948
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2948
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Based on the design docs, we should create a MockScheduler-based unit test in 
> the shim that validates foreign pod tracking.






[jira] [Updated] (YUNIKORN-2948) Add MockScheduler test which verifies foreign pod tracking

2024-10-28 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2948:
---
Summary: Add MockScheduler test which verifies foreign pod tracking  (was: 
[shim] Write MockScheduler test which verifies foreign pod tracking)

> Add MockScheduler test which verifies foreign pod tracking
> --
>
> Key: YUNIKORN-2948
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2948
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
>
> Based on the design docs, we should create a MockScheduler-based unit test in 
> the shim that validates foreign pod tracking.






[jira] [Resolved] (YUNIKORN-2931) Create foreign pod e2e tests

2024-10-25 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2931.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Create foreign pod e2e tests 
> -
>
> Key: YUNIKORN-2931
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2931
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>







[jira] [Commented] (YUNIKORN-2949) Load external Scheduler Service using Module Federation

2024-10-25 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892835#comment-17892835
 ] 

Craig Condit commented on YUNIKORN-2949:


Also, the so-called "YuniKorn History Service" is appropriating an Apache 
trademark without permission. It cannot be called that, as it gives the 
impression it is an official Apache YuniKorn project.

> Load external Scheduler Service using Module Federation
> ---
>
> Key: YUNIKORN-2949
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2949
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: webapp
>Reporter: Denis Coric
>Assignee: Denis Coric
>Priority: Major
>  Labels: pull-request-available
>
> Add an option to load external Scheduler Service in Applications View using 
> the Module Federation.
> This will only be enabled if the correct env variables are set. If not, the 
> application must behave as is.
>  






[jira] [Commented] (YUNIKORN-2949) Load external Scheduler Service using Module Federation

2024-10-25 Thread Craig Condit (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892834#comment-17892834
 ] 

Craig Condit commented on YUNIKORN-2949:


This has got to stop. We shouldn't be adding hooks for proprietary or 
unsupported third-party components into the YuniKorn codebase. If there's meant to 
be an official history service, it should be done under the Apache umbrella. 
I'm a firm -1 on this.

> Load external Scheduler Service using Module Federation
> ---
>
> Key: YUNIKORN-2949
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2949
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: webapp
>Reporter: Denis Coric
>Assignee: Denis Coric
>Priority: Major
>  Labels: pull-request-available
>
> Add an option to load external Scheduler Service in Applications View using 
> the Module Federation.
> This will only be enabled if the correct env variables are set. If not, the 
> application must behave as is.
>  






[jira] [Closed] (YUNIKORN-2949) Load external Scheduler Service using Module Federation

2024-10-25 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit closed YUNIKORN-2949.
--
Resolution: Won't Do

> Load external Scheduler Service using Module Federation
> ---
>
> Key: YUNIKORN-2949
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2949
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: webapp
>Reporter: Denis Coric
>Assignee: Denis Coric
>Priority: Major
>  Labels: pull-request-available
>
> Add an option to load external Scheduler Service in Applications View using 
> the Module Federation.
> This will only be enabled if the correct env variables are set. If not, the 
> application must behave as is.
>  






[jira] [Resolved] (YUNIKORN-2943) Fix typo in Prometheus monitoring guide

2024-10-25 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2943.

 Fix Version/s: 1.7.0
Target Version: 1.7.0
Resolution: Fixed

Merged to master.

> Fix typo in Prometheus monitoring guide
> ---
>
> Key: YUNIKORN-2943
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2943
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Tzu-Hua Lan
>Assignee: Tzu-Hua Lan
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Fix a typo in the Prometheus and Grafana monitoring 
> [documentation|https://yunikorn.apache.org/docs/next/user_guide/observability/prometheus#3-use-service-mointor-to-define-monitor-yunikorn-service-target].
> Change:
> - Before: "3. Use Service Mointor to Define monitor yunikorn service target"
> - After: "3. Use Service Monitor to Define monitor yunikorn service target"
> This fixes the misspelling of "Monitor" in the section heading.






[jira] [Updated] (YUNIKORN-2941) Remove plugin mode from the install section of Getting Started

2024-10-25 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2941:
---
Summary: Remove plugin mode from the install section of Getting Started  
(was: Remove plugin mode from the install section of Get Started)

> Remove plugin mode from the install section of Getting Started
> --
>
> Key: YUNIKORN-2941
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2941
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: website
>Reporter: Michael Chu
>Assignee: Michael Chu
>Priority: Minor
>  Labels: newbie, pull-request-available
>
> Since plugin mode is now deprecated and will be removed in a future release, 
> it would be better to add a deprecated tag to the plugin mode in the install 
> section to prevent any confusion.






[jira] [Updated] (YUNIKORN-2894) Update KubeRay operator documentation for YuniKorn integration

2024-10-25 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2894:
---
Summary: Update KubeRay operator documentation for YuniKorn integration  
(was: [Docs][RayCluster]update KubeRay operator documentation for YuniKorn 
integration)

> Update KubeRay operator documentation for YuniKorn integration
> --
>
> Key: YUNIKORN-2894
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2894
> Project: Apache YuniKorn
>  Issue Type: Improvement
>Reporter: Hsien-Cheng(Ryan) Huang
>Assignee: Hsien-Cheng(Ryan) Huang
>Priority: Major
>  Labels: pull-request-available
>
> KubeRay now supports gang scheduling via this PR: 
> https://github.com/ray-project/kuberay/pull/2396 
> and is available since its 1.2.0 release: 
> https://github.com/ray-project/kuberay/releases/tag/v1.2.0
> Proposed modifications:
> 1. specify version update to v1.2.2
> 2. document updates based on ray-docs: 
> https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/yunikorn.html





