[jira] [Updated] (YUNIKORN-2630) Release context lock in shim when processing config in the core

2024-05-16 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YUNIKORN-2630:

Target Version: 1.6.0, 1.5.2

> Release context lock in shim when processing config in the core
> ---
>
> Key: YUNIKORN-2630
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2630
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
>  Labels: pull-request-available
>
> When a change comes in for the configmaps we process the change under a 
> context lock as we need to merge the two configmaps.
> We keep this lock even after all the work in the shim is done and processing 
> has been transferred to the core. This is unneeded as the core has its own 
> locking and serialisation of the changes.
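
The change described above could look roughly like the following Go sketch. It is an illustration only, not the shim's code: the `Context` type, `mergeMaps`, `updateConfig` and the `sendToCore` callback are hypothetical names, and the two-element array simply stands in for the two configmaps being merged. The point is the ordering: merge under the context lock, release the lock, then hand the result to the core.

```go
package main

import (
	"fmt"
	"sync"
)

// Context is a hypothetical stand-in for the shim context; only the lock and
// the two cached configmaps matter for this sketch.
type Context struct {
	lock       sync.RWMutex
	configMaps [2]map[string]string // index 0: defaults, index 1: overrides
}

// mergeMaps flattens the configmaps; later maps win on key conflicts.
func mergeMaps(maps ...map[string]string) map[string]string {
	out := make(map[string]string)
	for _, m := range maps {
		for k, v := range m {
			out[k] = v
		}
	}
	return out
}

// updateConfig stores the changed configmap and merges both maps under the
// context lock, then releases the lock before handing the result to the core.
// sendToCore is an assumed callback representing the call into the core.
func (c *Context) updateConfig(index int, data map[string]string,
	sendToCore func(map[string]string) error) error {
	c.lock.Lock()
	c.configMaps[index] = data
	merged := mergeMaps(c.configMaps[0], c.configMaps[1])
	c.lock.Unlock() // shim work is done; the core serialises changes itself

	return sendToCore(merged)
}

func main() {
	ctx := &Context{}
	ctx.configMaps[0] = map[string]string{"log.level": "INFO"}
	err := ctx.updateConfig(1, map[string]string{"log.level": "DEBUG"},
		func(cfg map[string]string) error {
			// Taking the context lock here does not block, which shows
			// the lock was released before the core was called.
			ctx.lock.RLock()
			defer ctx.lock.RUnlock()
			fmt.Println("core received log.level =", cfg["log.level"])
			return nil
		})
	fmt.Println("err:", err)
}
```

Copying the merged result before unlocking keeps the core call independent of later changes to the cached maps, which is one possible way to make the early release safe.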



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org



[jira] [Updated] (YUNIKORN-2630) Release context lock in shim when processing config in the core

2024-05-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YUNIKORN-2630:
-
Labels: pull-request-available  (was: )

> Release context lock in shim when processing config in the core
> ---
>
> Key: YUNIKORN-2630
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2630
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
>  Labels: pull-request-available
>
> When a change comes in for the configmaps we process the change under a 
> context lock as we need to merge the two configmaps.
> We keep this lock even after all the work in the shim is done and processing 
> has been transferred to the core. This is unneeded as the core has its own 
> locking and serialisation of the changes.






[jira] [Created] (YUNIKORN-2630) Release context lock in shim when processing config in the core

2024-05-16 Thread Wilfred Spiegelenburg (Jira)
Wilfred Spiegelenburg created YUNIKORN-2630:
---

 Summary: Release context lock in shim when processing config in 
the core
 Key: YUNIKORN-2630
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2630
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: shim - kubernetes
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg


When a change comes in for the configmaps we process the change under a 
context lock as we need to merge the two configmaps.

We keep this lock even after all the work in the shim is done and processing 
has been transferred to the core. This is unneeded as the core has its own 
locking and serialisation of the changes.






[jira] [Resolved] (YUNIKORN-2628) fix release announcement links

2024-05-16 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg resolved YUNIKORN-2628.
-
Fix Version/s: 1.6.0
   Resolution: Fixed

Links are fixed after removing the {{..}} from the path.

> fix release announcement links
> --
>
> Key: YUNIKORN-2628
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2628
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: website
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> In YUNIKORN-2595 a regression snuck in breaking the links to the release 
> announcements.
> Need to reverse that path change for the release announcements.






[jira] [Commented] (YUNIKORN-2629) Adding a node can result in a deadlock

2024-05-16 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847123#comment-17847123
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-2629:
-

I think we need to look at the context lock in the k8shim in general.

The context lock is held while we do non-context work. There is no need to 
hold the lock if all we do is wait for a response that might or might not 
trigger post processing.

> Adding a node can result in a deadlock
> --
>
> Key: YUNIKORN-2629
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2629
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Affects Versions: 1.5.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Blocker
>
> Adding a new node after Yunikorn state initialization can result in a 
> deadlock.
> The problem is that {{Context.addNode()}} holds a lock while we're waiting 
> for the {{NodeAccepted}} event:
> {noformat}
> dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, func(event interface{}) {
>     nodeEvent, ok := event.(CachedSchedulerNodeEvent)
>     if !ok {
>         return
>     }
>     [...] removed for clarity
>     wg.Done()
> })
> defer dispatcher.UnregisterEventHandler(handlerID, dispatcher.EventTypeNode)
> if err := ctx.apiProvider.GetAPIs().SchedulerAPI.UpdateNode({
>     Nodes: nodesToRegister,
>     RmID:  schedulerconf.GetSchedulerConf().ClusterID,
> }); err != nil {
>     log.Log(log.ShimContext).Error("Failed to register nodes", zap.Error(err))
>     return nil, err
> }
> // wait for all responses to accumulate
> wg.Wait()  <--- shim gets stuck here
> {noformat}
> If tasks are being processed, then the dispatcher will try to retrieve the 
> event handler, which is returned from Context:
> {noformat}
> go func() {
>     for {
>         select {
>         case event := <-getDispatcher().eventChan:
>             switch v := event.(type) {
>             case events.TaskEvent:
>                 getEventHandler(EventTypeTask)(v)  <--- eventually calls Context.getTask()
>             case events.ApplicationEvent:
>                 getEventHandler(EventTypeApp)(v)
>             case events.SchedulerNodeEvent:
>                 getEventHandler(EventTypeNode)(v)
> {noformat}
> Since {{addNode()}} is holding a write lock, the event processing loop gets 
> stuck, so {{registerNodes()}} will never progress.






(yunikorn-site) branch master updated: [YUNIKORN-2628] revert relative path for release announcement (#430)

2024-05-16 Thread wilfreds
This is an automated email from the ASF dual-hosted git repository.

wilfreds pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/yunikorn-site.git


The following commit(s) were added to refs/heads/master by this push:
 new a29e6105c3 [YUNIKORN-2628] revert relative path for release 
announcement (#430)
a29e6105c3 is described below

commit a29e6105c3e10e8f80d616bf2acf52dbeec81fac
Author: Wilfred Spiegelenburg 
AuthorDate: Fri May 17 11:44:46 2024 +1000

[YUNIKORN-2628] revert relative path for release announcement (#430)

Closes: #430

Signed-off-by: Wilfred Spiegelenburg 
---
 src/pages/community/download.md | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/pages/community/download.md b/src/pages/community/download.md
index e3ae0259e3..683a866263 100644
--- a/src/pages/community/download.md
+++ b/src/pages/community/download.md
@@ -33,11 +33,11 @@ We publish prebuilt docker images for everyone's 
convenience.
 
 The latest release of Apache YuniKorn is v1.5.0.
 
-| Version | Release date | Source download 



  | Docker images   

  [...]
-|-|--|---|--
 [...]
-| v1.5.0  | 2024-03-14   | 
[Download](https://www.apache.org/dyn/closer.lua/yunikorn/1.5.0/apache-yunikorn-1.5.0-src.tar.gz)[Checksum](https://downloads.apache.org/yunikorn/1.5.0/apache-yunikorn-1.5.0-src.tar.gz.sha512)
 & 
[Signature](https://downloads.apache.org/yunikorn/1.5.0/apache-yunikorn-1.5.0-src.tar.gz.asc)
 | 
[scheduler](https://hub.docker.com/layers/apache/yunikorn/scheduler-1.5.0/images/sha256-9cefd0df164b9c4d39f9e10b010eaf7d8f89b130de1648e94f75b9b95d300a00)[admission-
 [...]
-| v1.4.0  | 2023-11-20   | 
[Download](https://archive.apache.org/dist/yunikorn/1.4.0/apache-yunikorn-1.4.0-src.tar.gz)[Checksum](https://archive.apache.org/dist/yunikorn/1.4.0/apache-yunikorn-1.4.0-src.tar.gz.sha512)
 & 
[Signature](https://archive.apache.org/dist/yunikorn/1.4.0/apache-yunikorn-1.4.0-src.tar.gz.asc)
 | 
[scheduler](https://hub.docker.com/layers/apache/yunikorn/scheduler-1.4.0/images/sha256-d013be8e3ad7eb8e51ce23951e6899a4b74088e52c3767f3fcc7efcdcc0904f5)[admission-
 [...]
-| v1.3.0  | 2023-06-12   | 
[Download](https://archive.apache.org/dist/yunikorn/1.3.0/apache-yunikorn-1.3.0-src.tar.gz)[Checksum](https://archive.apache.org/dist/yunikorn/1.3.0/apache-yunikorn-1.3.0-src.tar.gz.sha512)
 & 
[Signature](https://archive.apache.org/dist/yunikorn/1.3.0/apache-yunikorn-1.3.0-src.tar.gz.asc)
 | 
[scheduler](https://hub.docker.com/layers/apache/yunikorn/scheduler-1.3.0/images/sha256-99a1973728c6684b1da7631dbf015daa1dbf519dbab1ffc8b23fccdfa7ffd0c5)[admission-
 [...]
+| Version | Release date | Source download 



  | Docker images   

  [...]
+|-|--|---|--
 [...]
+| v1.5.0  | 2024-03-14   | 
[Download](https://www.apache.org/dyn/closer.lua/yunikorn/1.5.0/apache-yunikorn-1.5.0-src.tar.gz)[Checksum](https://downloads.apache.org/yunikorn/1.5.0/apache-yunikorn-1.5.0-src.tar.gz.sha512)
 & 
[Signature](https://downloads.apache.org/yunikorn/1.5.0/apache-yunikorn-1.5.0-src.tar.gz.asc)
 | 

[jira] [Updated] (YUNIKORN-2629) Adding a node can result in a deadlock

2024-05-16 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YUNIKORN-2629:
---
Description: 
Adding a new node after Yunikorn state initialization can result in a deadlock.

The problem is that {{Context.addNode()}} holds a lock while we're waiting for 
the {{NodeAccepted}} event:
{noformat}
   dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
func(event interface{}) {
nodeEvent, ok := event.(CachedSchedulerNodeEvent)
if !ok {
return
}
[...] removed for clarity
wg.Done()
})
defer dispatcher.UnregisterEventHandler(handlerID, 
dispatcher.EventTypeNode)
if err := 
ctx.apiProvider.GetAPIs().SchedulerAPI.UpdateNode({
Nodes: nodesToRegister,
RmID:  schedulerconf.GetSchedulerConf().ClusterID,
}); err != nil {
log.Log(log.ShimContext).Error("Failed to register nodes", 
zap.Error(err))
return nil, err
}

// wait for all responses to accumulate
wg.Wait()  <--- shim gets stuck here
 {noformat}
If tasks are being processed, then the dispatcher will try to retrieve the 
event handler, which is returned from Context:
{noformat}
go func() {
for {
select {
case event := <-getDispatcher().eventChan:
switch v := event.(type) {
case events.TaskEvent:
getEventHandler(EventTypeTask)(v)  <--- 
eventually calls Context.getTask()
case events.ApplicationEvent:
getEventHandler(EventTypeApp)(v)
case events.SchedulerNodeEvent:
getEventHandler(EventTypeNode)(v)  
{noformat}
Since {{addNode()}} is holding a write lock, the event processing loop gets 
stuck, so {{registerNodes()}} will never progress.

  was:
Adding a new node after Yunikorn state initialization can result in a deadlock.

The problem is that {{Context.addNode()}} holds a lock while we're waiting for 
the {{NodeAccepted}} event:
{noformat}
   dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
func(event interface{}) {
nodeEvent, ok := event.(CachedSchedulerNodeEvent)
if !ok {
return
}
[...] removed for clarity
wg.Done()
})
defer dispatcher.UnregisterEventHandler(handlerID, 
dispatcher.EventTypeNode)
api := ctx.apiProvider.GetAPIs().SchedulerAPI
if err := api.UpdateNode({
Nodes: nodesToRegister,
RmID:  schedulerconf.GetSchedulerConf().ClusterID,
}); err != nil {
log.Log(log.ShimContext).Error("Failed to register nodes", 
zap.Error(err))
return nil, err
}

// wait for all responses to accumulate
wg.Wait()  <--- shim gets stuck here
 {noformat}
If tasks are being processed, then the dispatcher will try to retrieve the 
event handler, which is returned from Context:
{noformat}
go func() {
for {
select {
case event := <-getDispatcher().eventChan:
switch v := event.(type) {
case events.TaskEvent:
getEventHandler(EventTypeTask)(v)  <--- 
eventually calls Context.getTask()
case events.ApplicationEvent:
getEventHandler(EventTypeApp)(v)
case events.SchedulerNodeEvent:
getEventHandler(EventTypeNode)(v)  
{noformat}
Since {{addNode()}} is holding a write lock, the event processing loop gets 
stuck, so {{registerNodes()}} will never progress.


> Adding a node can result in a deadlock
> --
>
> Key: YUNIKORN-2629
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2629
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Affects Versions: 1.5.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Blocker
>
> Adding a new node after Yunikorn state initialization can result in a 
> deadlock.
> The problem is that {{Context.addNode()}} holds a lock while we're waiting 
> for the {{NodeAccepted}} event:
> {noformat}
>dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
> func(event interface{}) {
>   nodeEvent, ok := event.(CachedSchedulerNodeEvent)
>  

[jira] [Updated] (YUNIKORN-2629) Adding a node can result in a deadlock

2024-05-16 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YUNIKORN-2629:
---
Description: 
Adding a new node after Yunikorn state initialization can result in a deadlock.

The problem is that {{Context.addNode()}} holds a lock while we're waiting for 
the {{NodeAccepted}} event:
{noformat}
   dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
func(event interface{}) {
nodeEvent, ok := event.(CachedSchedulerNodeEvent)
if !ok {
return
}
[...] removed for clarity
wg.Done()
})
defer dispatcher.UnregisterEventHandler(handlerID, 
dispatcher.EventTypeNode)
api := ctx.apiProvider.GetAPIs().SchedulerAPI
if err := api.UpdateNode({
Nodes: nodesToRegister,
RmID:  schedulerconf.GetSchedulerConf().ClusterID,
}); err != nil {
log.Log(log.ShimContext).Error("Failed to register nodes", 
zap.Error(err))
return nil, err
}

// wait for all responses to accumulate
wg.Wait()  <--- shim gets stuck here
 {noformat}
If tasks are being processed, then the dispatcher will try to retrieve the 
event handler, which is returned from Context:
{noformat}
go func() {
for {
select {
case event := <-getDispatcher().eventChan:
switch v := event.(type) {
case events.TaskEvent:
getEventHandler(EventTypeTask)(v)  <--- 
eventually calls Context.getTask()
case events.ApplicationEvent:
getEventHandler(EventTypeApp)(v)
case events.SchedulerNodeEvent:
getEventHandler(EventTypeNode)(v)  
{noformat}
Since {{addNode()}} is holding a write lock, the event processing loop gets 
stuck, so {{registerNodes()}} will never progress.

  was:
Adding a new node after Yunikorn state initialization can result in a deadlock.

The problem is that {{Context.addNode()}} holds a lock while we're waiting for 
the {{NodeAccepted}} event:
{noformat}
   dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
func(event interface{}) {
nodeEvent, ok := event.(CachedSchedulerNodeEvent)
if !ok {
return
}
[...] removed for clarity
wg.Done()
})
defer dispatcher.UnregisterEventHandler(handlerID, 
dispatcher.EventTypeNode)
api := ctx.apiProvider.GetAPIs().SchedulerAPI
if err := api.UpdateNode({
Nodes: nodesToRegister,
RmID:  schedulerconf.GetSchedulerConf().ClusterID,
}); err != nil {
log.Log(log.ShimContext).Error("Failed to register nodes", 
zap.Error(err))
return nil, err
}

// wait for all responses to accumulate
wg.Wait()  <--- shim gets stuck here
 {noformat}
If tasks are being processed, then the dispatcher will try to retrieve the 
event handler, which is returned from Context:
{noformat}
go func() {
for {
select {
case event := <-getDispatcher().eventChan:
switch v := event.(type) {
case events.TaskEvent:
getEventHandler(EventTypeTask)(v)  <--- 
eventually calls Context.getTask()
case events.ApplicationEvent:
getEventHandler(EventTypeApp)(v)
case events.SchedulerNodeEvent:
getEventHandler(EventTypeNode)(v)  
{noformat}

Since {{addNode()}} is holding a write lock, the event processing loop gets 
stuck.


> Adding a node can result in a deadlock
> --
>
> Key: YUNIKORN-2629
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2629
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Affects Versions: 1.5.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Blocker
>
> Adding a new node after Yunikorn state initialization can result in a 
> deadlock.
> The problem is that {{Context.addNode()}} holds a lock while we're waiting 
> for the {{NodeAccepted}} event:
> {noformat}
>dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
> func(event interface{}) {
>   nodeEvent, ok := event.(CachedSchedulerNodeEvent)
>   if !ok {
>  

[jira] [Updated] (YUNIKORN-2629) Adding a node can result in a deadlock

2024-05-16 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YUNIKORN-2629:
---
Affects Version/s: 1.5.0

> Adding a node can result in a deadlock
> --
>
> Key: YUNIKORN-2629
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2629
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Affects Versions: 1.5.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Blocker
>
> Adding a new node after Yunikorn state initialization can result in a 
> deadlock.
> The problem is that {{Context.addNode()}} holds a lock while we're waiting 
> for the {{NodeAccepted}} event:
> {noformat}
>dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
> func(event interface{}) {
>   nodeEvent, ok := event.(CachedSchedulerNodeEvent)
>   if !ok {
>   return
>   }
>   [...] removed for clarity
>   wg.Done()
>   })
>   defer dispatcher.UnregisterEventHandler(handlerID, 
> dispatcher.EventTypeNode)
>   api := ctx.apiProvider.GetAPIs().SchedulerAPI
>   if err := api.UpdateNode({
>   Nodes: nodesToRegister,
>   RmID:  schedulerconf.GetSchedulerConf().ClusterID,
>   }); err != nil {
>   log.Log(log.ShimContext).Error("Failed to register nodes", 
> zap.Error(err))
>   return nil, err
>   }
>   // wait for all responses to accumulate
>   wg.Wait()  <--- shim gets stuck here
>  {noformat}
> If tasks are being processed, then the dispatcher will try to retrieve the 
> event handler, which is returned from Context:
> {noformat}
> go func() {
>   for {
>   select {
>   case event := <-getDispatcher().eventChan:
>   switch v := event.(type) {
>   case events.TaskEvent:
>   getEventHandler(EventTypeTask)(v)  <--- 
> eventually calls Context.getTask()
>   case events.ApplicationEvent:
>   getEventHandler(EventTypeApp)(v)
>   case events.SchedulerNodeEvent:
>   getEventHandler(EventTypeNode)(v)  
> {noformat}
> Since {{addNode()}} is holding a write lock, the event processing loop gets 
> stuck.






[jira] [Updated] (YUNIKORN-2629) Adding a node can result in a deadlock

2024-05-16 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YUNIKORN-2629:
---
Description: 
Adding a new node after Yunikorn state initialization can result in a deadlock.

The problem is that {{Context.addNode()}} holds a lock while we're waiting for 
the {{NodeAccepted}} event:
{noformat}
   dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
func(event interface{}) {
nodeEvent, ok := event.(CachedSchedulerNodeEvent)
if !ok {
return
}
[...] removed for clarity
wg.Done()
})
defer dispatcher.UnregisterEventHandler(handlerID, 
dispatcher.EventTypeNode)
api := ctx.apiProvider.GetAPIs().SchedulerAPI
if err := api.UpdateNode({
Nodes: nodesToRegister,
RmID:  schedulerconf.GetSchedulerConf().ClusterID,
}); err != nil {
log.Log(log.ShimContext).Error("Failed to register nodes", 
zap.Error(err))
return nil, err
}

// wait for all responses to accumulate
wg.Wait()  <--- shim gets stuck here
 {noformat}
If tasks are being processed, then the dispatcher will try to retrieve the 
event handler, which is returned from Context:
{noformat}
go func() {
for {
select {
case event := <-getDispatcher().eventChan:
switch v := event.(type) {
case events.TaskEvent:
getEventHandler(EventTypeTask)(v)  <--- 
eventually calls Context.getTask()
case events.ApplicationEvent:
getEventHandler(EventTypeApp)(v)
case events.SchedulerNodeEvent:
getEventHandler(EventTypeNode)(v)  
{noformat}

Since {{addNode()}} is holding a write lock, the event processing loop gets 
stuck.

  was:
Adding a new node after Yunikorn state initialization can result in a deadlock.

The problem is that {{Context.addNode()}} holds a lock while we're waiting for 
the {{NodeAccepted}} event:
{noformat}
dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, func(event 
interface{}) {
nodeEvent, ok := event.(CachedSchedulerNodeEvent)
if !ok {
return
}
[...] removed for clarity
wg.Done()
})
defer dispatcher.UnregisterEventHandler(handlerID, 
dispatcher.EventTypeNode)
api := ctx.apiProvider.GetAPIs().SchedulerAPI
if err := api.UpdateNode({
Nodes: nodesToRegister,
RmID:  schedulerconf.GetSchedulerConf().ClusterID,
}); err != nil {
log.Log(log.ShimContext).Error("Failed to register nodes", 
zap.Error(err))
return nil, err
}

// wait for all responses to accumulate
wg.Wait()  <--- shim gets stuck here
 {noformat}
If tasks are being processed, then the dispatcher will try to retrieve the 
event handler, which is returned from Context:
{noformat}
go func() {
for {
select {
case event := <-getDispatcher().eventChan:
switch v := event.(type) {
case events.TaskEvent:
getEventHandler(EventTypeTask)(v)  <--- 
eventually calls Context.getTask()
case events.ApplicationEvent:
getEventHandler(EventTypeApp)(v)
case events.SchedulerNodeEvent:
getEventHandler(EventTypeNode)(v)  
{noformat}

Since {{addNode()}} is holding a write lock, the event processing loop gets 
stuck.


> Adding a node can result in a deadlock
> --
>
> Key: YUNIKORN-2629
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2629
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Blocker
>
> Adding a new node after Yunikorn state initialization can result in a 
> deadlock.
> The problem is that {{Context.addNode()}} holds a lock while we're waiting 
> for the {{NodeAccepted}} event:
> {noformat}
>dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, 
> func(event interface{}) {
>   nodeEvent, ok := event.(CachedSchedulerNodeEvent)
>   if !ok {
>   return
>   }
>   [...] removed 

[jira] [Created] (YUNIKORN-2629) Adding a node can result in a deadlock

2024-05-16 Thread Peter Bacsko (Jira)
Peter Bacsko created YUNIKORN-2629:
--

 Summary: Adding a node can result in a deadlock
 Key: YUNIKORN-2629
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2629
 Project: Apache YuniKorn
  Issue Type: Bug
  Components: shim - kubernetes
Reporter: Peter Bacsko
Assignee: Peter Bacsko


Adding a new node after Yunikorn state initialization can result in a deadlock.

The problem is that {{Context.addNode()}} holds a lock while we're waiting for 
the {{NodeAccepted}} event:
{noformat}
dispatcher.RegisterEventHandler(handlerID, dispatcher.EventTypeNode, func(event 
interface{}) {
nodeEvent, ok := event.(CachedSchedulerNodeEvent)
if !ok {
return
}
[...] removed for clarity
wg.Done()
})
defer dispatcher.UnregisterEventHandler(handlerID, 
dispatcher.EventTypeNode)
api := ctx.apiProvider.GetAPIs().SchedulerAPI
if err := api.UpdateNode({
Nodes: nodesToRegister,
RmID:  schedulerconf.GetSchedulerConf().ClusterID,
}); err != nil {
log.Log(log.ShimContext).Error("Failed to register nodes", 
zap.Error(err))
return nil, err
}

// wait for all responses to accumulate
wg.Wait()  <--- shim gets stuck here
 {noformat}
If tasks are being processed, then the dispatcher will try to retrieve the 
event handler, which is returned from Context:
{noformat}
go func() {
for {
select {
case event := <-getDispatcher().eventChan:
switch v := event.(type) {
case events.TaskEvent:
getEventHandler(EventTypeTask)(v)  <--- 
eventually calls Context.getTask()
case events.ApplicationEvent:
getEventHandler(EventTypeApp)(v)
case events.SchedulerNodeEvent:
getEventHandler(EventTypeNode)(v)  
{noformat}

Since {{addNode()}} is holding a write lock, the event processing loop gets 
stuck.






(yunikorn-release) annotated tag v1.5.1 updated (ab98307 -> 758038d)

2024-05-16 Thread pbacsko
This is an automated email from the ASF dual-hosted git repository.

pbacsko pushed a change to annotated tag v1.5.1
in repository https://gitbox.apache.org/repos/asf/yunikorn-release.git


*** WARNING: tag v1.5.1 was modified! ***

from ab98307  (commit)
  to 758038d  (tag)
 tagging ab9830736c1617c8a765a113ccd58605e020f8f7 (commit)
 replaces v1.5.0
  by Peter Bacsko
  on Thu May 16 13:21:03 2024 +0200

- Log -
Apache YuniKorn v1.5.1
---


No new revisions were added by this update.

Summary of changes:





(yunikorn-release) branch branch-1.5 updated: Update CHANGELOG for 1.5.1

2024-05-16 Thread pbacsko
This is an automated email from the ASF dual-hosted git repository.

pbacsko pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/yunikorn-release.git


The following commit(s) were added to refs/heads/branch-1.5 by this push:
 new ab98307   Update CHANGELOG for 1.5.1
ab98307 is described below

commit ab9830736c1617c8a765a113ccd58605e020f8f7
Author: Peter Bacsko 
AuthorDate: Wed May 8 15:13:27 2024 +0200

 Update CHANGELOG for 1.5.1
---
 release-top-level-artifacts/CHANGELOG | 238 +++---
 1 file changed, 19 insertions(+), 219 deletions(-)

diff --git a/release-top-level-artifacts/CHANGELOG 
b/release-top-level-artifacts/CHANGELOG
index 07b6a6e..52983c7 100644
--- a/release-top-level-artifacts/CHANGELOG
+++ b/release-top-level-artifacts/CHANGELOG
@@ -16,237 +16,37 @@
 #
 
 
-Release Notes - Apache YuniKorn - Version 1.5.0
+Release Notes - Apache YuniKorn - Version 1.5.1
 
 ** Sub-task
-* [YUNIKORN-1709] - Add event streaming logic
-* [YUNIKORN-1950] - Improving test coverage for whole user/group 
enforcement feature - Phase 2
-* [YUNIKORN-1956] - Add wildcard user/group limit e2e tests
-* [YUNIKORN-2037] - Document the performance using kwok
-* [YUNIKORN-2089] - Move usedResource type and tests to their own files
-* [YUNIKORN-2116] - Track user/group events
-* [YUNIKORN-2118] - Add smoke test for event streaming
-* [YUNIKORN-2119] - Add check for parent queue user/group limit lower than 
child queue
-* [YUNIKORN-2132] - Show active event streaming in the state dump
-* [YUNIKORN-2136] - limit max resource should be greater than zero
-* [YUNIKORN-2145] - refactor: ApplicationSummary into its own file
-* [YUNIKORN-2147] - Limit the number of concurrent event streams
-* [YUNIKORN-2151] - Report resource used by placeholder pods in the app 
summary
-* [YUNIKORN-2159] - Clean up AppManager implementation
-* [YUNIKORN-2163] - Fix HTTP status codes in some REST handlers
-* [YUNIKORN-2164] - Use ParseUint instead of ParseInt in getEvents()
-* [YUNIKORN-2175] - Add queue headRoom for  Rest API querying and improve 
logs
-* [YUNIKORN-2176] - Add test for user & group max resource changes
-* [YUNIKORN-2180] - Clean up scheduler state initialization
-* [YUNIKORN-2188] - Improve state transition event to include the eventinfo
-* [YUNIKORN-2201] - Evaluate the performance impact of Headroom() and 
CanRunApp()
-* [YUNIKORN-2203] - Possible log spew in UGM code
-* [YUNIKORN-2205] - remove the warning of processing nonexistent 
"namespace.guaranteed"
-* [YUNIKORN-2209] - Remove limit checks in QueueTracker
-* [YUNIKORN-2210] - Metrics: use WithLabelValues instead of With
-* [YUNIKORN-2212] - Don't collect requests that hasn't been scheduled yet 
or already triggered scale up
-* [YUNIKORN-2231] - Show node list when hovering mouse over the node 
utitutilization bar chart
-* [YUNIKORN-2257] - Add rest API to retrieve node utilization for multiple 
resource types
-* [YUNIKORN-2264] - Add missing Originator and PreemptionPolicy fields to 
SI Allocation
-* [YUNIKORN-2265] - Populate Originator and PreemptionPolicy on existing 
allocations
-* [YUNIKORN-2284] - ERROR message when stopping Service context
-* [YUNIKORN-2285] - Don't re-calculate reservationKey
-* [YUNIKORN-2292] - Flaky E2E Test: Orphan pods still exist after 
TearDownNamespace()
-* [YUNIKORN-2293] - Flaky E2E Test: Failed asserts in 
LogTestClusterInfoWrapper() blocked the resources cleanup steps
-* [YUNIKORN-2294] - Flaky E2E Test: "Verify_Hard_GS_Failed_State" polling 
short-lived "Failing" application status
-* [YUNIKORN-2309] - Add pod status updater logic to the MockScheduler 
performance test
-* [YUNIKORN-2312] - Cleanup BinPacking e2e test workload before removing 
namespace
-* [YUNIKORN-2313] - Flaky E2E Test:  "Verify_basic_preemption" experiences 
flakiness due to race condition
-* [YUNIKORN-2316] - Update REST API docs for 
/ws/v1/scheduler/node-utilizations
-* [YUNIKORN-2325] - Add a chart to display multi-type resource utilisation 
(Web)
-* [YUNIKORN-2335] - Use go standard library min and max functions
-* [YUNIKORN-2337] - Update documentation about event streaming
-* [YUNIKORN-2339] - Remove Nodes Utilisation chart from Dashboard page 
(Web)
-* [YUNIKORN-2366] - Shim: Update GetPodResources() to handle in-place pod 
resource updates
-* [YUNIKORN-2370] - Proper event handling for failed headroom checks
-* [YUNIKORN-2373] - Extend EventRecord type with user/group related data
-* [YUNIKORN-2379] - Adjust layout of node utilization chart(Web)
-* [YUNIKORN-2381] - Update the copyright years in NOTICE files to 2024
-* [YUNIKORN-2382] - Expose K8s supported versions on web
-* [YUNIKORN-2390] - Improve mousehover result for node utilization 
chart(Web)
-* [YUNIKORN-2395] - Remove Jaeger 

[jira] [Resolved] (YUNIKORN-2612) Tagging for 1.5.1

2024-05-16 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko resolved YUNIKORN-2612.

Fix Version/s: 1.5.1
   Resolution: Fixed

> Tagging for 1.5.1
> -
>
> Key: YUNIKORN-2612
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2612
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: release
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 1.5.1
>
>
> Tagging for updating dependencies (SI/core/k8shim).
> No branching is needed because we'll deliver the release from branch-1.5 
> directly as we did with incubator minor releases.






[jira] [Resolved] (YUNIKORN-2602) Fix spelling/grammar in configvalidator

2024-05-16 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved YUNIKORN-2602.
--
Fix Version/s: 1.6.0
   Resolution: Fixed

> Fix spelling/grammar in configvalidator
> ---
>
> Key: YUNIKORN-2602
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2602
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>Reporter: Peter Bacsko
>Assignee: Yun Sun
>Priority: Trivial
>  Labels: newbie, pull-request-available
> Fix For: 1.6.0
>
>
> Let's fix some minor grammar issues in configvalidator.go.
> E.g.: "existed" -> "existing", but there could be other mistakes.






(yunikorn-core) branch master updated: [YUNIKORN-2602] Fix spelling/grammar in configvalidator.go (#869)

2024-05-16 Thread chia7712
This is an automated email from the ASF dual-hosted git repository.

chia7712 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/yunikorn-core.git


The following commit(s) were added to refs/heads/master by this push:
 new 9d7fddf9 [YUNIKORN-2602] Fix spelling/grammar in configvalidator.go 
(#869)
9d7fddf9 is described below

commit 9d7fddf9a3618ec449e7ca974461d5a3745ac49f
Author: YUN SUN 
AuthorDate: Thu May 16 18:05:28 2024 +0800

[YUNIKORN-2602] Fix spelling/grammar in configvalidator.go (#869)

Closes: #869

Signed-off-by: Chia-Ping Tsai 
---
 pkg/common/configs/configvalidator.go | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/pkg/common/configs/configvalidator.go 
b/pkg/common/configs/configvalidator.go
index ec8e15a3..390b39b0 100644
--- a/pkg/common/configs/configvalidator.go
+++ b/pkg/common/configs/configvalidator.go
@@ -478,7 +478,7 @@ func checkPlacementFilter(filter Filter) error {
 }
 
 // Check a single limit entry
-func checkLimit(limit Limit, existedUserName map[string]bool, existedGroupName 
map[string]bool, queue *QueueConfig) error {
+func checkLimit(limit Limit, existingUserName map[string]bool, 
existingGroupName map[string]bool, queue *QueueConfig) error {
if len(limit.Users) == 0 && len(limit.Groups) == 0 {
return fmt.Errorf("empty user and group lists defined in limit 
'%v'", limit)
}
@@ -488,15 +488,15 @@ func checkLimit(limit Limit, existedUserName 
map[string]bool, existedGroupName m
return fmt.Errorf("invalid limit user name '%s' in 
limit definition", name)
}
 
-   if existedUserName[name] {
+   if existingUserName[name] {
return fmt.Errorf("duplicated user name '%s', already 
exists", name)
}
-   existedUserName[name] = true
+   existingUserName[name] = true
 
// The user without wildcard should not happen after the 
wildcard user
// It means the wildcard for user should be the last item for 
limits object list which including the username,
// and we should only set one wildcard user for all limits
-   if existedUserName["*"] && name != "*" {
+   if existingUserName["*"] && name != "*" {
return fmt.Errorf("should not set no wildcard user %s 
after wildcard user limit", name)
}
}
@@ -505,15 +505,15 @@ func checkLimit(limit Limit, existedUserName 
map[string]bool, existedGroupName m
return fmt.Errorf("invalid limit group name '%s' in 
limit definition", name)
}
 
-   if existedGroupName[name] {
-   return fmt.Errorf("duplicated group name '%s' , already 
existed", name)
+   if existingGroupName[name] {
+   return fmt.Errorf("duplicated group name '%s'", name)
}
-   existedGroupName[name] = true
+   existingGroupName[name] = true
 
// The group without wildcard should not happen after the 
wildcard group
// It means the wildcard for group should be the last item for 
limits object list which including the group name,
// and we should only set one wildcard group for all limits
-   if existedGroupName["*"] && name != "*" {
+   if existingGroupName["*"] && name != "*" {
return fmt.Errorf("should not set no wildcard group %s 
after wildcard group limit", name)
}
}
@@ -522,7 +522,7 @@ func checkLimit(limit Limit, existedUserName 
map[string]bool, existedGroupName m
// If there is no specific group mentioned the wildcard group limit 
would thus be the same as the queue limit.
// For that reason we do not allow specifying only one group limit that 
is using the wildcard.
// There must be at least one limit with a group name defined.
-   if existedGroupName["*"] && len(existedGroupName) == 1 {
+   if existingGroupName["*"] && len(existingGroupName) == 1 {
return fmt.Errorf("should not specify only one group limit that 
is using the wildcard. " +
"There must be at least one limit with a group name 
defined ")
}
@@ -578,11 +578,11 @@ func checkLimits(limits []Limit, obj string, queue 
*QueueConfig) error {
zap.String("objName", obj),
zap.Int("limitsLength", len(limits)))
 
-   existedUserName := make(map[string]bool)
-   existedGroupName := make(map[string]bool)
+   existingUserName := make(map[string]bool)
+   existingGroupName := make(map[string]bool)
 
for _, limit := range limits {
-   if err := checkLimit(limit, existedUserName, existedGroupName, 
queue); err != nil {
+   if err := 
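
The comments in the diff above describe an ordering rule: a name may appear only once across all limits, and the wildcard "*" must be the last user entry. The following is a minimal sketch of that rule only, not the code from configvalidator.go. The function name checkNames and its signature are hypothetical; the error strings are copied from the diff.

```go
package main

import "fmt"

// checkNames is a hypothetical, simplified stand-in for the user-name part
// of checkLimit. The seen map is shared across all limits of one queue,
// in the same way existingUserName is shared in the diff.
func checkNames(names []string, seen map[string]bool) error {
	for _, name := range names {
		// A name may be used in only one limit.
		if seen[name] {
			return fmt.Errorf("duplicated user name '%s', already exists", name)
		}
		seen[name] = true

		// Once the wildcard has been recorded, no specific name may follow.
		if seen["*"] && name != "*" {
			return fmt.Errorf("should not set no wildcard user %s after wildcard user limit", name)
		}
	}
	return nil
}

func main() {
	seen := make(map[string]bool)
	// A specific user followed by the wildcard is accepted.
	fmt.Println(checkNames([]string{"alice", "*"}, seen))
	// A specific user in a later limit, after the wildcard, is rejected.
	fmt.Println(checkNames([]string{"bob"}, seen))
}
```

Sharing one map across calls is what lets the check span several limit entries: the second call fails because the wildcard was recorded by the first.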

(yunikorn-web) annotated tag v1.5.1 updated (db71be7 -> 75d2434)

2024-05-16 Thread pbacsko
This is an automated email from the ASF dual-hosted git repository.

pbacsko pushed a change to annotated tag v1.5.1
in repository https://gitbox.apache.org/repos/asf/yunikorn-web.git


*** WARNING: tag v1.5.1 was modified! ***

from db71be7  (commit)
  to 75d2434  (tag)
 tagging db71be72bae18e08b4264d06d781a841503aa283 (commit)
 replaces v1.4.0-1
  by Peter Bacsko
  on Thu May 16 11:56:36 2024 +0200

- Log -
Apache YuniKorn v1.5.1
---


No new revisions were added by this update.

Summary of changes:





(yunikorn-scheduler-interface) annotated tag v1.5.1 updated (3ad69cd -> 1628488)

2024-05-16 Thread pbacsko
This is an automated email from the ASF dual-hosted git repository.

pbacsko pushed a change to annotated tag v1.5.1
in repository 
https://gitbox.apache.org/repos/asf/yunikorn-scheduler-interface.git


*** WARNING: tag v1.5.1 was modified! ***

from 3ad69cd  (commit)
  to 1628488  (tag)
 tagging 3ad69cdbc247cb5ce54acff16e06331ed95cba8c (commit)
 replaces v1.5.0
  by Peter Bacsko
  on Thu May 16 11:56:12 2024 +0200

- Log -
Apache YuniKorn v1.5.1
---


No new revisions were added by this update.

Summary of changes:





(yunikorn-k8shim) annotated tag v1.5.1 updated (207e4031 -> d23c5134)

2024-05-16 Thread pbacsko
This is an automated email from the ASF dual-hosted git repository.

pbacsko pushed a change to annotated tag v1.5.1
in repository https://gitbox.apache.org/repos/asf/yunikorn-k8shim.git


*** WARNING: tag v1.5.1 was modified! ***

from 207e4031 (commit)
  to d23c5134 (tag)
 tagging 207e4031c6484c965fca4018b6b8176afc5956b4 (commit)
 replaces v1.5.0
  by Peter Bacsko
  on Thu May 16 11:55:49 2024 +0200

- Log -
Apache YuniKorn v1.5.1
---


No new revisions were added by this update.

Summary of changes:





(yunikorn-core) annotated tag v1.5.1 updated (4856bc8d -> f121aeaf)

2024-05-16 Thread pbacsko
This is an automated email from the ASF dual-hosted git repository.

pbacsko pushed a change to annotated tag v1.5.1
in repository https://gitbox.apache.org/repos/asf/yunikorn-core.git


*** WARNING: tag v1.5.1 was modified! ***

from 4856bc8d (commit)
  to f121aeaf (tag)
 tagging 4856bc8d7d7bc41f6640e306435eeb885eee8a3f (commit)
 replaces v1.5.0
  by Peter Bacsko
  on Thu May 16 11:54:57 2024 +0200

- Log -
Apache YuniKorn v1.5.1
---


No new revisions were added by this update.

Summary of changes:





[jira] [Resolved] (YUNIKORN-2627) Add K8s 1.30 to the e2e matrix

2024-05-16 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg resolved YUNIKORN-2627.
-
Fix Version/s: 1.6.0
   Resolution: Fixed

Upgraded kind to version 0.23 and added 1.30 as a new version to test with.

> Add K8s 1.30 to the e2e matrix
> --
>
> Key: YUNIKORN-2627
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2627
> Project: Apache YuniKorn
>  Issue Type: Improvement
>Reporter: Wilfred Spiegelenburg
>Assignee: Tseng Hsi-Huang
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 1.6.0
>
>
> k8s 1.30 support in kind is now available as part of the [0.23 
> release|https://github.com/kubernetes-sigs/kind/releases/tag/v0.23.0].
> We need to add 1.30 to the matrix for the next release.






(yunikorn-k8shim) branch master updated: [YUNIKORN-2627] Add K8s 1.30 to the e2e matrix (#840)

2024-05-16 Thread wilfreds
This is an automated email from the ASF dual-hosted git repository.

wilfreds pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/yunikorn-k8shim.git


The following commit(s) were added to refs/heads/master by this push:
 new 5f80f49b [YUNIKORN-2627] Add K8s 1.30 to the e2e matrix (#840)
5f80f49b is described below

commit 5f80f49b2ee5acb3432b2d5534dbe7f3d3bcc2fc
Author: Tseng Hsi-Huang <9501...@gmail.com>
AuthorDate: Thu May 16 17:28:41 2024 +1000

[YUNIKORN-2627] Add K8s 1.30 to the e2e matrix (#840)

Closes: #840

Signed-off-by: Wilfred Spiegelenburg 
---
 .github/workflows/pre-commit.yml | 2 +-
 Makefile | 2 +-
 scripts/run-e2e-tests.sh | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/pre-commit.yml b/.github/workflows/pre-commit.yml
index 4131fde9..afed3906 100644
--- a/.github/workflows/pre-commit.yml
+++ b/.github/workflows/pre-commit.yml
@@ -43,7 +43,7 @@ jobs:
 strategy:
   fail-fast: false
   matrix:
-k8s: [v1.29.2, v1.28.7, v1.27.11, v1.26.14, v1.25.16, v1.24.17]
+k8s: [v1.30.0, v1.29.2, v1.28.7, v1.27.11, v1.26.14, v1.25.16, 
v1.24.17]
 plugin: ['', '--plugin']
 steps:
   - name: Checkout source code
diff --git a/Makefile b/Makefile
index 50b3a659..76dd81fc 100644
--- a/Makefile
+++ b/Makefile
@@ -155,7 +155,7 @@ KUBECTL_VERSION=v1.27.7
 KUBECTL_BIN=$(TOOLS_DIR)/kubectl
 
 # kind
-KIND_VERSION=v0.20.0
+KIND_VERSION=v0.23.0
 KIND_BIN=$(TOOLS_DIR)/kind
 
 # helm
diff --git a/scripts/run-e2e-tests.sh b/scripts/run-e2e-tests.sh
index 07073c4e..02c21ec7 100755
--- a/scripts/run-e2e-tests.sh
+++ b/scripts/run-e2e-tests.sh
@@ -164,9 +164,10 @@ Examples:
   ${NAME} -a test -n yk8s -v kindest/node:v1.27.11
   ${NAME} -a test -n yk8s -v kindest/node:v1.28.7
   ${NAME} -a test -n yk8s -v kindest/node:v1.29.2
+  ${NAME} -a test -n yk8s -v kindest/node:v1.30.0
 
   Use a local helm chart path:
-${NAME} -a test -n yk8s -v kindest/node:v1.29.2 -p 
../yunikorn-release/helm-charts/yunikorn
+${NAME} -a test -n yk8s -v kindest/node:v1.30.0 -p 
../yunikorn-release/helm-charts/yunikorn
 EOF
 }
 





[jira] [Updated] (YUNIKORN-2616) Remove unused bool return from PreemptionPredicates()

2024-05-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YUNIKORN-2616:
-
Labels: pull-request-available  (was: )

> Remove unused bool return from PreemptionPredicates()
> -
>
> Key: YUNIKORN-2616
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2616
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Wilfred Spiegelenburg
>Assignee: Hsien-Cheng(Ryan) Huang
>Priority: Trivial
>  Labels: pull-request-available
>
> The predicate manager method {{PreemptionPredicates()}} returns two values: an 
> int and a boolean. The boolean is false if the integer is -1 and true for 0 or 
> larger. There is no need for the boolean as the -1 already indicates the same.
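
The redundancy described above can be sketched in Go as follows. This is an illustration under assumptions, not the shim's code: preemptionIndexOld and preemptionIndex are hypothetical names, and a slice of booleans stands in for the real predicate checks. The only property taken from the issue is that the boolean is true exactly when the integer is 0 or larger.

```go
package main

import "fmt"

// preemptionIndexOld is a hypothetical stand-in for the current shape:
// it returns an index plus a bool that is true only when the index is >= 0.
func preemptionIndexOld(checks []bool) (int, bool) {
	for i, passed := range checks {
		if passed {
			return i, true
		}
	}
	return -1, false
}

// preemptionIndex is the simplified shape: -1 alone signals "no match".
func preemptionIndex(checks []bool) int {
	for i, passed := range checks {
		if passed {
			return i
		}
	}
	return -1
}

func main() {
	checks := []bool{false, true, true}

	// The bool carries no information beyond idx >= 0.
	idx, ok := preemptionIndexOld(checks)
	fmt.Println(idx, ok, ok == (idx >= 0))

	// Callers of the simplified form test the index directly.
	if i := preemptionIndex(checks); i >= 0 {
		fmt.Println("first passing check at index", i)
	}
	if i := preemptionIndex(nil); i < 0 {
		fmt.Println("no check passed")
	}
}
```

Dropping the boolean leaves callers with a single value to test, which removes the possibility of the two return values disagreeing.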






[jira] [Commented] (YUNIKORN-2626) Add flag to helm chart to disable web container

2024-05-16 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846827#comment-17846827
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-2626:
-

I have no strong feelings either way. The default should keep the web container 
enabled, but that is my only requirement.

Create a PR to make it possible: charts are 
[here|https://github.com/wilfred-s/yunikorn-release/tree/master/helm-charts/yunikorn]

> Add flag to helm chart to disable web container
> ---
>
> Key: YUNIKORN-2626
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2626
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: deployment
>Reporter: Michael
>Priority: Major
>
> For our use case we only really need the admission controller and scheduler. 
> The helm chart currently does not provide a way to disable deploying the web 
> container, and it would be great if that were possible.
> Is there any reason not to disable the web container?


