[jira] [Updated] (YUNIKORN-185) K8shim - logging cleanup

2020-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YUNIKORN-185:

Labels: pull-request-available  (was: )

> K8shim - logging cleanup
> 
>
> Key: YUNIKORN-185
> URL: https://issues.apache.org/jira/browse/YUNIKORN-185
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: pull-request-available
>
> Clean up some logs in the shim



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org



[jira] [Assigned] (YUNIKORN-183) Logging clean up

2020-05-22 Thread Weiwei Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned YUNIKORN-183:


Assignee: Weiwei Yang

> Logging clean up
> 
>
> Key: YUNIKORN-183
> URL: https://issues.apache.org/jira/browse/YUNIKORN-183
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - scheduler, shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>
> Today, we log too many things at the DEBUG level, which makes issue 
> tracking very difficult. We need to clean up the logging on some critical 
> paths. A few things we need to look at:
>  # Reduce or minimize the logs in the core scheduling cycle (these are 
> becoming overwhelming)
>  # Keep track of the main things happening: allocation, release, reservation, 
> preservation, etc.
> This is a parent issue; we need to create sub-tasks for core and shim.






[jira] [Created] (YUNIKORN-185) K8shim - logging cleanup

2020-05-22 Thread Weiwei Yang (Jira)
Weiwei Yang created YUNIKORN-185:


 Summary: K8shim - logging cleanup
 Key: YUNIKORN-185
 URL: https://issues.apache.org/jira/browse/YUNIKORN-185
 Project: Apache YuniKorn
  Issue Type: Sub-task
  Components: shim - kubernetes
Reporter: Weiwei Yang
Assignee: Weiwei Yang


Clean up some logs in the shim






[jira] [Updated] (YUNIKORN-184) Update YuniKorn-Core Design Documentation

2020-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YUNIKORN-184:

Labels: pull-request-available  (was: )

> Update YuniKorn-Core Design Documentation
> -
>
> Key: YUNIKORN-184
> URL: https://issues.apache.org/jira/browse/YUNIKORN-184
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: documentation
>Reporter: Wangda Tan
>Priority: Major
>  Labels: pull-request-available
>
> The original design doc is badly outdated; it was written before 
> yunikorn-core was completed. A lot has changed since then, so we need to 
> refresh the design doc so that new members can more easily understand the 
> structure of yunikorn-core.






[jira] [Created] (YUNIKORN-184) Update YuniKorn-Core Design Documentation

2020-05-22 Thread Wangda Tan (Jira)
Wangda Tan created YUNIKORN-184:
---

 Summary: Update YuniKorn-Core Design Documentation
 Key: YUNIKORN-184
 URL: https://issues.apache.org/jira/browse/YUNIKORN-184
 Project: Apache YuniKorn
  Issue Type: Task
  Components: documentation
Reporter: Wangda Tan


The original design doc is badly outdated; it was written before 
yunikorn-core was completed. A lot has changed since then, so we need to 
refresh the design doc so that new members can more easily understand the 
structure of yunikorn-core.






[jira] [Created] (YUNIKORN-183) Logging clean up

2020-05-22 Thread Weiwei Yang (Jira)
Weiwei Yang created YUNIKORN-183:


 Summary: Logging clean up
 Key: YUNIKORN-183
 URL: https://issues.apache.org/jira/browse/YUNIKORN-183
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: core - scheduler, shim - kubernetes
Reporter: Weiwei Yang


Today, we log too many things at the DEBUG level, which makes issue 
tracking very difficult. We need to clean up the logging on some critical 
paths. A few things we need to look at:
 # Reduce or minimize the logs in the core scheduling cycle (these are 
becoming overwhelming)
 # Keep track of the main things happening: allocation, release, reservation, 
preservation, etc.

This is a parent issue; we need to create sub-tasks for core and shim.






[jira] [Commented] (YUNIKORN-173) Generates one default application ID per namespace in the admission controller

2020-05-22 Thread Weiwei Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114426#comment-17114426
 ] 

Weiwei Yang commented on YUNIKORN-173:
--

hi [~wangda]

I should have mentioned this somewhere. The behavior is the same: we always 
group these unnamed pods into one app per namespace.

This implies
 # unnamed pods in one namespace can NOT be submitted to different queues
 # unnamed pods can run in several apps in the same queue if their namespaces 
map to that one queue

Which is equivalent to
 # 1 namespace to 1 queue mapping: Supported
 # N namespaces to 1 queue mapping: Supported
 # 1 namespace to N queues mapping: Not Supported (may not be a valid case 
anyway)

If another placement rule is used, we will need to ensure that the rule does 
not place pods as in case #3.
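
The grouping rule above can be sketched in a few lines of Go. This is only an illustration, not the admission controller's code: the `Pod` struct, the `applicationId` label key and the `autogen-<namespace>` ID format are assumptions made for the example.

```go
package main

import "fmt"

// Pod is a hypothetical, stripped-down stand-in for a Kubernetes pod.
type Pod struct {
	Name      string
	Namespace string
	Labels    map[string]string
}

// appIDLabel is an assumed label key; the real key may differ.
const appIDLabel = "applicationId"

// ResolveAppID returns the pod's explicit application ID when one is set.
// Otherwise it returns one generated ID shared by all unnamed pods in the
// same namespace, so those pods end up in a single app (and a single queue).
func ResolveAppID(p Pod) string {
	if id, ok := p.Labels[appIDLabel]; ok && id != "" {
		return id
	}
	return "autogen-" + p.Namespace
}

func main() {
	pods := []Pod{
		{Name: "a", Namespace: "dev"},
		{Name: "b", Namespace: "dev"},
		{Name: "c", Namespace: "test"},
		{Name: "d", Namespace: "dev", Labels: map[string]string{appIDLabel: "spark-1"}},
	}
	for _, p := range pods {
		fmt.Printf("%s/%s -> %s\n", p.Namespace, p.Name, ResolveAppID(p))
	}
}
```

Because the generated ID depends only on the namespace, pods `a` and `b` share one app, which is why a 1 namespace to N queues mapping cannot be expressed for unnamed pods.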

 

> Generates one default application ID per namespace in the admission controller
> --
>
> Key: YUNIKORN-173
> URL: https://issues.apache.org/jira/browse/YUNIKORN-173
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> If an app doesn't explicitly specify an application ID, let's group such pods 
> into one single app per namespace.






[jira] [Commented] (YUNIKORN-173) Generates one default application ID per namespace in the admission controller

2020-05-22 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114422#comment-17114422
 ] 

Wangda Tan commented on YUNIKORN-173:
-

[~wwei], [~wilfreds],  

What is the behavior when namespace -> queue creation/mapping is disabled? 

> Generates one default application ID per namespace in the admission controller
> --
>
> Key: YUNIKORN-173
> URL: https://issues.apache.org/jira/browse/YUNIKORN-173
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> If an app doesn't explicitly specify an application ID, let's group such pods 
> into one single app per namespace.






[jira] [Commented] (YUNIKORN-158) Admission controller deployment file should use the same version as the scheduler

2020-05-22 Thread Weiwei Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114399#comment-17114399
 ] 

Weiwei Yang commented on YUNIKORN-158:
--

Hi [~wilfreds] 

Thanks! I've tested this. It looks like when I submitted the patch, I appended 
the --build-args to the docker build script for the admission-controller image 
instead of the scheduler image. I just submitted another patch for this. 

 

> Admission controller deployment file should use the same version as the 
> scheduler
> -
>
> Key: YUNIKORN-158
> URL: https://issues.apache.org/jira/browse/YUNIKORN-158
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: pull-request-available
>
> Today, the admission controller is deployed as a post-start hook in the 
> scheduler pod, and the template file has hard coded docker image name, 
> [https://github.com/apache/incubator-yunikorn-k8shim/blob/b8a4a01fa1f6149c8617c914a721296a71037736/deployments/admission-controllers/scheduler/templates/server.yaml.template#L33.]
>  We should get this fixed and make sure the admission controller version is 
> always aligned with the scheduler. 






[jira] [Commented] (YUNIKORN-117) Create event cache for queue and application events

2020-05-22 Thread Weiwei Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114379#comment-17114379
 ] 

Weiwei Yang commented on YUNIKORN-117:
--

hi [~adam.antal]

Correct. Currently we publish all the events related to predicate failures to 
the event system; that's why you see this.

Once the event system you are building is in place, we will use it to publish 
more compact and meaningful events.

I know this is currently WIP, but I am still wondering why the {{Message}} is 
empty there. Also, we might want to keep the {{Reason}} field tight.

> Create event cache for queue and application events
> ---
>
> Key: YUNIKORN-117
> URL: https://issues.apache.org/jira/browse/YUNIKORN-117
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - cache, core - scheduler
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Critical
>  Labels: pull-request-available
>
> Create a simple preliminary implementation of the event cache of YUNIKORN-42.
> We have the following limited scope for this task:
> - implement it as a separate process from the scheduler (similar to 
> {{PartitionManager}})
> - only deal with queues and applications (the pods and nodes can be added 
> later)
> - only store the apps last visited time from the scheduler
> - clean up those objects that haven't been visited in the last 24h
> Other cache implementations can be also considered.
> As a starting point, channels are a safe choice to have async communication 
> with the scheduler without expecting bigger performance loss.






[jira] [Commented] (YUNIKORN-155) data race in unit test: TestSchedulerRecoveryWhenPlacementRulesApplied

2020-05-22 Thread Weiwei Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114372#comment-17114372
 ] 

Weiwei Yang commented on YUNIKORN-155:
--

Awesome, thanks [~kmarton], [~adam.antal]!

This is good practice; I think we might need to implement String() for more 
of our structs.

The change looks good to me, but the UT shows a new failure. We need a 
separate Jira to track that.

 

> data race in unit test: TestSchedulerRecoveryWhenPlacementRulesApplied
> --
>
> Key: YUNIKORN-155
> URL: https://issues.apache.org/jira/browse/YUNIKORN-155
> Project: Apache YuniKorn
>  Issue Type: Test
>  Components: test - unit
>Reporter: Wilfred Spiegelenburg
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
> Attachments: data_race.txt
>
>
> Testing shows a new data race while logging the queue name for an application 
> that gets added.
> Details in the attached logs






[jira] [Commented] (YUNIKORN-117) Create event cache for queue and application events

2020-05-22 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114141#comment-17114141
 ] 

Adam Antal commented on YUNIKORN-117:
-

I made some progress. The current task is to handle the excessive number of 
events that the core emits to Kubernetes on every tryAllocate.
If we simply flushed everything, we would get {{kubectl describe}} output 
like this:
{noformat}
Events:
  Type     Reason                              Age                       From      Message
  ----     ------                              ----                      ----      -------
  Normal   Scheduling                          5m11s                     yunikorn  default/task0 is queued and waiting for allocation
  Warning  FailedScheduling                    5m11s (x24 over 5m11s)    yunikorn  [Predicate NodeUnknownCondition failed]
  Normal   Pending resources would not fit in  12s (x136548 over 4m47s)  yunikorn
{noformat}
This should be avoided.
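
One possible way to avoid such floods is to suppress repeats of the same event within a time window before anything is published. The Go sketch below illustrates that idea only; it is not what this Jira implements, and the `Limiter` type, its key of object plus reason, and the `at` helper are assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// eventKey identifies "the same event": same object, same reason.
type eventKey struct {
	object string
	reason string
}

// Limiter lets at most one event per key through per interval.
// It is not safe for concurrent use; a real version would need a lock.
type Limiter struct {
	interval time.Duration
	lastSent map[eventKey]time.Time
}

func NewLimiter(interval time.Duration) *Limiter {
	return &Limiter{interval: interval, lastSent: make(map[eventKey]time.Time)}
}

// Allow reports whether an event for (object, reason) should be published at now.
func (l *Limiter) Allow(object, reason string, now time.Time) bool {
	k := eventKey{object: object, reason: reason}
	if last, ok := l.lastSent[k]; ok && now.Sub(last) < l.interval {
		return false // same event was published too recently
	}
	l.lastSent[k] = now
	return true
}

// at returns a deterministic timestamp ms milliseconds after the Unix epoch.
func at(ms int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(ms) * time.Millisecond)
}

func main() {
	limiter := NewLimiter(time.Minute)
	published := 0
	// Simulate a tight scheduling loop: one failed attempt per millisecond.
	for i := 0; i < 1000; i++ {
		if limiter.Allow("default/task0", "Pending", at(i)) {
			published++
		}
	}
	fmt.Println("attempts: 1000, published:", published)
}
```

With a one-minute window, the thousand simulated attempts within one second collapse into a single published event.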

> Create event cache for queue and application events
> ---
>
> Key: YUNIKORN-117
> URL: https://issues.apache.org/jira/browse/YUNIKORN-117
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: core - cache, core - scheduler
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Critical
>  Labels: pull-request-available
>
> Create a simple preliminary implementation of the event cache of YUNIKORN-42.
> We have the following limited scope for this task:
> - implement it as a separate process from the scheduler (similar to 
> {{PartitionManager}})
> - only deal with queues and applications (the pods and nodes can be added 
> later)
> - only store the apps last visited time from the scheduler
> - clean up those objects that haven't been visited in the last 24h
> Other cache implementations can be also considered.
> As a starting point, channels are a safe choice to have async communication 
> with the scheduler without expecting bigger performance loss.






[jira] [Commented] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114130#comment-17114130
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-99:
---

Code changes committed; leaving this open to add documentation.

> Enhanced FIFO scheduling for batch workloads
> 
>
> Key: YUNIKORN-99
> URL: https://issues.apache.org/jira/browse/YUNIKORN-99
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: core - scheduler
>Reporter: Weiwei Yang
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> An enhanced version of FIFO scheduling for batch workloads






[jira] [Updated] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YUNIKORN-99:
--
Fix Version/s: 0.9

> Enhanced FIFO scheduling for batch workloads
> 
>
> Key: YUNIKORN-99
> URL: https://issues.apache.org/jira/browse/YUNIKORN-99
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: core - scheduler
>Reporter: Weiwei Yang
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> An enhanced version of FIFO scheduling for batch workloads






[jira] [Updated] (YUNIKORN-181) Implement a FAIR version of the stateaware

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YUNIKORN-181:
---
Issue Type: New Feature  (was: Bug)

> Implement a FAIR version of the stateaware 
> ---
>
> Key: YUNIKORN-181
> URL: https://issues.apache.org/jira/browse/YUNIKORN-181
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: core - scheduler
>Reporter: Wilfred Spiegelenburg
>Priority: Minor
>
> In YUNIKORN-99 a new sorting policy was implemented that only allowed one app 
> in an accepted state to be scheduled.
> The apps to be scheduled are always sorted FIFO.
> We should add a FAIR version of the sorting policy too.






[jira] [Created] (YUNIKORN-182) fix lint issues

2020-05-22 Thread Wilfred Spiegelenburg (Jira)
Wilfred Spiegelenburg created YUNIKORN-182:
--

 Summary: fix lint issues
 Key: YUNIKORN-182
 URL: https://issues.apache.org/jira/browse/YUNIKORN-182
 Project: Apache YuniKorn
  Issue Type: Task
  Components: build
Reporter: Wilfred Spiegelenburg


When we added the lint test, most major issues were fixed. There are still a 
lot of issues, especially in tests, that need to be fixed.

This is a container Jira to track that work in both the k8shim and the core 
repos.

Work should be split into multiple parts (per linter?)






[jira] [Created] (YUNIKORN-181) Implement a FAIR version of the stateaware

2020-05-22 Thread Wilfred Spiegelenburg (Jira)
Wilfred Spiegelenburg created YUNIKORN-181:
--

 Summary: Implement a FAIR version of the stateaware 
 Key: YUNIKORN-181
 URL: https://issues.apache.org/jira/browse/YUNIKORN-181
 Project: Apache YuniKorn
  Issue Type: Bug
  Components: core - scheduler
Reporter: Wilfred Spiegelenburg


In YUNIKORN-99 a new sorting policy was implemented that only allowed one app 
in an accepted state to be scheduled.

The apps to be scheduled are always sorted FIFO.

We should add a FAIR version of the sorting policy too.
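
The difference between the two orderings can be sketched as follows. This is a rough illustration under stated assumptions, not the scheduler's sorting code: the `App` struct is hypothetical, "fair" is approximated as "lowest current usage first", and the state-aware rule from YUNIKORN-99 (only one app in an accepted state) is not modelled.

```go
package main

import (
	"fmt"
	"sort"
)

// App is a hypothetical, simplified view of a schedulable application.
type App struct {
	ID        string
	Submitted int     // submission sequence number; lower means earlier
	Usage     float64 // assumed share of resources the app already holds
}

// SortFIFO orders apps by submission order, earliest first.
func SortFIFO(apps []App) {
	sort.SliceStable(apps, func(i, j int) bool {
		return apps[i].Submitted < apps[j].Submitted
	})
}

// SortFair orders apps by current usage, lowest first, and falls back to
// submission order when two apps have the same usage.
func SortFair(apps []App) {
	sort.SliceStable(apps, func(i, j int) bool {
		if apps[i].Usage != apps[j].Usage {
			return apps[i].Usage < apps[j].Usage
		}
		return apps[i].Submitted < apps[j].Submitted
	})
}

func main() {
	apps := []App{
		{ID: "app-1", Submitted: 1, Usage: 0.6},
		{ID: "app-2", Submitted: 2, Usage: 0.1},
		{ID: "app-3", Submitted: 3, Usage: 0.3},
	}
	SortFIFO(apps)
	fmt.Println("FIFO:", apps[0].ID, apps[1].ID, apps[2].ID)
	SortFair(apps)
	fmt.Println("FAIR:", apps[0].ID, apps[1].ID, apps[2].ID)
}
```

Under FIFO the earliest submission always goes first; under the fair ordering the app holding the least is served first, regardless of when it arrived.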






[jira] [Resolved] (YUNIKORN-159) Remove helm charts from the k8shim repo and update documentation accordingly

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg resolved YUNIKORN-159.

Fix Version/s: 0.9
   Resolution: Fixed

Core changes are also pushed (PR #155).

That was the last part. Closing as fixed; all other documentation will follow 
in YUNIKORN-167.

> Remove helm charts from the k8shim repo and update documentation accordingly
> 
>
> Key: YUNIKORN-159
> URL: https://issues.apache.org/jira/browse/YUNIKORN-159
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: build
>Reporter: Wilfred Spiegelenburg
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> After we move the helm deployment to the release repo we should remove the 
> files from the k8shim repo.
>  This should include updating the 
> [documentation|https://github.com/apache/incubator-yunikorn-core/blob/master/docs/user-guide.md#quick-start]
>  to point to the correct place to find the helm charts and explain the 
> workings






[jira] [Commented] (YUNIKORN-167) Fix release tool after YUNIKORN-159

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114104#comment-17114104
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-167:


This should include documentation to be added to the core repo with the rest of 
the build info.

> Fix release tool after YUNIKORN-159
> ---
>
> Key: YUNIKORN-167
> URL: https://issues.apache.org/jira/browse/YUNIKORN-167
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: release
>Reporter: Kinga Marton
>Assignee: Kinga Marton
>Priority: Major
>
> With YUNIKORN-159 helm charts will be removed from the shim repository. We 
> will need to adjust the release tool accordingly.
> Also we will need to include the helm chart release in the release process.






[jira] [Comment Edited] (YUNIKORN-159) Remove helm charts from the k8shim repo and update documentation accordingly

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114099#comment-17114099
 ] 

Wilfred Spiegelenburg edited comment on YUNIKORN-159 at 5/22/20, 2:32 PM:
--

k8shim changes have been pushed (PR #121)


was (Author: wifreds):
k8shim changes have been pushed

> Remove helm charts from the k8shim repo and update documentation accordingly
> 
>
> Key: YUNIKORN-159
> URL: https://issues.apache.org/jira/browse/YUNIKORN-159
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: build
>Reporter: Wilfred Spiegelenburg
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
>
> After we move the helm deployment to the release repo we should remove the 
> files from the k8shim repo.
>  This should include updating the 
> [documentation|https://github.com/apache/incubator-yunikorn-core/blob/master/docs/user-guide.md#quick-start]
>  to point to the correct place to find the helm charts and explain the 
> workings






[jira] [Commented] (YUNIKORN-159) Remove helm charts from the k8shim repo and update documentation accordingly

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114099#comment-17114099
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-159:


k8shim changes have been pushed

> Remove helm charts from the k8shim repo and update documentation accordingly
> 
>
> Key: YUNIKORN-159
> URL: https://issues.apache.org/jira/browse/YUNIKORN-159
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: build
>Reporter: Wilfred Spiegelenburg
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
>
> After we move the helm deployment to the release repo we should remove the 
> files from the k8shim repo.
>  This should include updating the 
> [documentation|https://github.com/apache/incubator-yunikorn-core/blob/master/docs/user-guide.md#quick-start]
>  to point to the correct place to find the helm charts and explain the 
> workings






[jira] [Resolved] (YUNIKORN-137) Update examples and docs to use docker images from ASF dockerhub repo

2020-05-22 Thread Kinga Marton (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kinga Marton resolved YUNIKORN-137.
---
Resolution: Fixed

> Update examples and docs to use docker images from ASF dockerhub repo
> -
>
> Key: YUNIKORN-137
> URL: https://issues.apache.org/jira/browse/YUNIKORN-137
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Kinga Marton
>Assignee: Kinga Marton
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> We have docker images pushed to ASF dockerhub repo, such as:
>  [-https://hub.docker.com/repository/docker/yunikorn/yunikorn-scheduler-k8s-]
> [https://hub.docker.com/r/apache/yunikorn]
> we need to use these images in our examples. We need to update our user guide 
> and other related documents for the same.






[jira] [Commented] (YUNIKORN-137) Update examples and docs to use docker images from ASF dockerhub repo

2020-05-22 Thread Kinga Marton (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114066#comment-17114066
 ] 

Kinga Marton commented on YUNIKORN-137:
---

[~wilfreds], yes there are a few places in the web UI where we should change 
it. I changed those references in YUNIKORN-146

> Update examples and docs to use docker images from ASF dockerhub repo
> -
>
> Key: YUNIKORN-137
> URL: https://issues.apache.org/jira/browse/YUNIKORN-137
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Kinga Marton
>Assignee: Kinga Marton
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> We have docker images pushed to ASF dockerhub repo, such as:
>  [-https://hub.docker.com/repository/docker/yunikorn/yunikorn-scheduler-k8s-]
> [https://hub.docker.com/r/apache/yunikorn]
> we need to use these images in our examples. We need to update our user guide 
> and other related documents for the same.






[jira] [Commented] (YUNIKORN-146) Add travis integration for yunikorn-web repo

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114058#comment-17114058
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-146:


[~akhilpb] can you please check this with us? You have more experience in 
this area.

> Add travis integration for yunikorn-web repo
> 
>
> Key: YUNIKORN-146
> URL: https://issues.apache.org/jira/browse/YUNIKORN-146
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: build
>Reporter: Weiwei Yang
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (YUNIKORN-146) Add travis integration for yunikorn-web repo

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114054#comment-17114054
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-146:


Based on YUNIKORN-160 the release build fails on a new machine. Are we 
confident that this works from travis?

The other point to make is that the yarn with node code can change often 
(YUNIKORN-160). Are we going to get stable releases if we always pull the 
latest? Should we build a docker image to run the build in instead?

> Add travis integration for yunikorn-web repo
> 
>
> Key: YUNIKORN-146
> URL: https://issues.apache.org/jira/browse/YUNIKORN-146
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: build
>Reporter: Weiwei Yang
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (YUNIKORN-137) Update examples and docs to use docker images from ASF dockerhub repo

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YUNIKORN-137:
---
Fix Version/s: 0.9

Both k8shim and core changes have been committed. Please check if we need 
related changes in the web UI repo too. If not we can close this.

Thank you for the contribution, [~kmarton].

> Update examples and docs to use docker images from ASF dockerhub repo
> -
>
> Key: YUNIKORN-137
> URL: https://issues.apache.org/jira/browse/YUNIKORN-137
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Kinga Marton
>Assignee: Kinga Marton
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> We have docker images pushed to ASF dockerhub repo, such as:
>  [-https://hub.docker.com/repository/docker/yunikorn/yunikorn-scheduler-k8s-]
> [https://hub.docker.com/r/apache/yunikorn]
> we need to use these images in our examples. We need to update our user guide 
> and other related documents for the same.






[jira] [Commented] (YUNIKORN-158) Admission controller deployment file should use the same version as the scheduler

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114024#comment-17114024
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-158:


The change that is committed via PR #117 does not work and I have reverted the 
fix.

When I build the admission controller image the docker part complains with the 
following message:
{code:java}
 ---> 557c8e8053ed
[Warning] One or more build-args [DOCKER_IMAGE_REGISTRY DOCKER_IMAGE_VERSION] 
were not consumed
Successfully built 557c8e8053ed {code}
The build finishes but this is not what it is supposed to do.

When I build the scheduler image it is still pointing to the wrong admission 
controller.

The change for the template should be in the scheduler image and use the same 
procedure with {{sed}} from the makefile instead of making the change via the 
docker file

> Admission controller deployment file should use the same version as the 
> scheduler
> -
>
> Key: YUNIKORN-158
> URL: https://issues.apache.org/jira/browse/YUNIKORN-158
> Project: Apache YuniKorn
>  Issue Type: Sub-task
>  Components: shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: pull-request-available
>
> Today, the admission controller is deployed as a post-start hook in the 
> scheduler pod, and the template file has a hard-coded docker image name: 
> [https://github.com/apache/incubator-yunikorn-k8shim/blob/b8a4a01fa1f6149c8617c914a721296a71037736/deployments/admission-controllers/scheduler/templates/server.yaml.template#L33].
> We should get this fixed and make sure the admission controller version is 
> always aligned with the scheduler.






[jira] [Commented] (YUNIKORN-155) data race in unit test: TestSchedulerRecoveryWhenPlacementRulesApplied

2020-05-22 Thread Kinga Marton (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113961#comment-17113961
 ] 

Kinga Marton commented on YUNIKORN-155:
---

I was finally able to reproduce the data race with a simple unit test in which 
one goroutine sets the Queue while another one does some logging. The fix was 
to implement the String() method for the ApplicationInfo object and do the 
locking there as well.
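
A minimal sketch of that kind of fix, for illustration only. The struct below is a pared-down, hypothetical stand-in: the field names, the SetQueue method and the output format are assumptions, not the actual ApplicationInfo code. The point it shows is that String() takes the same lock as the setter, so a logger formatting the object cannot read the queue name while another goroutine is writing it.

```go
package main

import (
	"fmt"
	"sync"
)

// ApplicationInfo is a hypothetical, reduced version of the real object.
// Only the fields needed to show the locking pattern are included.
type ApplicationInfo struct {
	lock          sync.RWMutex
	ApplicationID string
	queueName     string
}

// SetQueue updates the queue name under the write lock.
func (ai *ApplicationInfo) SetQueue(name string) {
	ai.lock.Lock()
	defer ai.lock.Unlock()
	ai.queueName = name
}

// String takes the read lock before touching any field, so formatting the
// object for a log line is safe while SetQueue runs in another goroutine.
func (ai *ApplicationInfo) String() string {
	ai.lock.RLock()
	defer ai.lock.RUnlock()
	return fmt.Sprintf("ApplicationID: %s, QueueName: %s", ai.ApplicationID, ai.queueName)
}

func main() {
	app := &ApplicationInfo{ApplicationID: "app-1"}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // goroutine that sets the queue
		defer wg.Done()
		app.SetQueue("root.default")
	}()
	go func() { // goroutine that formats the object, as a logger would
		defer wg.Done()
		_ = app.String()
	}()
	wg.Wait()

	fmt.Println(app)
}
```

Running a test built around this pattern with the {{-race}} flag of {{go test}} is one way to check that the race is gone.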

Thank you [~adam.antal] for the hint!

> data race in unit test: TestSchedulerRecoveryWhenPlacementRulesApplied
> --
>
> Key: YUNIKORN-155
> URL: https://issues.apache.org/jira/browse/YUNIKORN-155
> Project: Apache YuniKorn
>  Issue Type: Test
>  Components: test - unit
>Reporter: Wilfred Spiegelenburg
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
> Attachments: data_race.txt
>
>
> Testing shows a new data race while logging the queue name for an application 
> that gets added.
> Details in the attached logs






[jira] [Updated] (YUNIKORN-155) data race in unit test: TestSchedulerRecoveryWhenPlacementRulesApplied

2020-05-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YUNIKORN-155:

Labels: pull-request-available  (was: )

> data race in unit test: TestSchedulerRecoveryWhenPlacementRulesApplied
> --
>
> Key: YUNIKORN-155
> URL: https://issues.apache.org/jira/browse/YUNIKORN-155
> Project: Apache YuniKorn
>  Issue Type: Test
>  Components: test - unit
>Reporter: Wilfred Spiegelenburg
>Assignee: Kinga Marton
>Priority: Major
>  Labels: pull-request-available
> Attachments: data_race.txt
>
>
> Testing shows a new data race while logging the queue name for an application 
> that gets added.
> Details in the attached logs






[jira] [Created] (YUNIKORN-180) Make a helm chart release and upload it to helm hub

2020-05-22 Thread Kinga Marton (Jira)
Kinga Marton created YUNIKORN-180:
-

 Summary: Make a helm chart release and upload it to helm hub
 Key: YUNIKORN-180
 URL: https://issues.apache.org/jira/browse/YUNIKORN-180
 Project: Apache YuniKorn
  Issue Type: Improvement
  Components: release
Reporter: Kinga Marton
Assignee: Kinga Marton


With YUNIKORN-140 we moved the helm charts to the [releases 
repository|https://github.com/apache/incubator-yunikorn-release], and the 
{{gh-pages}} branch is now also created.
There are still some steps to be done until we can have it published in the 
helm hub:
- ask the ASF Infra team to enable github pages for this repository and set 
gh-pages branch as source
- release the chart for 0.8.0 version
- upload it to helm hub
- update the documentation
- update the website with the link for the released helm charts






[jira] [Assigned] (YUNIKORN-177) SchedulerName doesn't need to be configurable

2020-05-22 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YUNIKORN-177:
---

Assignee: Adam Antal

> SchedulerName doesn't need to be configurable
> -
>
> Key: YUNIKORN-177
> URL: https://issues.apache.org/jira/browse/YUNIKORN-177
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Adam Antal
>Priority: Minor
>
> Currently, we allow the user to override the schedulerName. This is not 
> necessary; we should stick to {{schedulerName=yunikorn}}. Let's remove this 
> from the CLI options.






[jira] [Commented] (YUNIKORN-175) remove memory and vcore references from resources in tests for core

2020-05-22 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113864#comment-17113864
 ] 

Adam Antal commented on YUNIKORN-175:
-

Hi [~wilfreds],
This looks like a simple issue. Mind if I take it over?

> remove memory and vcore references from resources in tests for core 
> 
>
> Key: YUNIKORN-175
> URL: https://issues.apache.org/jira/browse/YUNIKORN-175
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: core - common, test - unit
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Minor
>
> The core is resource type agnostic.
> Many of the core tests, however, reference _memory_ and _vcore_ as if they 
> were predefined types. There is no predefined type, and we should not imply 
> that there is a default type.






[jira] [Updated] (YUNIKORN-179) Allow changing app id generation via option

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YUNIKORN-179:
---
Target Version: 0.9

> Allow changing app id generation via option
> ---
>
> Key: YUNIKORN-179
> URL: https://issues.apache.org/jira/browse/YUNIKORN-179
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Reporter: Wilfred Spiegelenburg
>Priority: Minor
>  Labels: newbie
>
> Currently a change to the way we generate a new app ID in the admission 
> controller requires a code change. In YUNIKORN-173 we moved from a per app ID 
> to a namespace based ID. We want to support both without a code change.
> This requires reinstating the app ID generation code that was removed and 
> making it possible to switch between the two using an environment setting. 
> We should end up with both ways in the code.
> We should default to the namespace generated ID.
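
A rough sketch of what such a switch could look like, under stated assumptions: the environment variable name {{APP_ID_MODE}}, the mode value {{per-app}}, and both ID formats are hypothetical and chosen only for illustration; they are not taken from the admission controller code. The only behaviour taken from the description above is that both ways exist side by side and that the namespace based ID is the default.

```go
package main

import (
	"fmt"
	"os"
)

// appIDModeEnv is a hypothetical environment variable name.
const appIDModeEnv = "APP_ID_MODE"

// generateAppID returns an application ID for a pod that did not set one.
// mode selects the strategy; anything other than "per-app" falls back to
// the namespace based ID, which is the default.
// The ID formats used here are illustrative assumptions.
func generateAppID(mode, namespace, suffix string) string {
	switch mode {
	case "per-app":
		// one ID per application: namespace plus a caller-supplied suffix
		return fmt.Sprintf("app-%s-%s", namespace, suffix)
	default:
		// one ID per namespace
		return "app-" + namespace
	}
}

func main() {
	// An unset variable yields an empty mode and therefore the default.
	mode := os.Getenv(appIDModeEnv)
	fmt.Println(generateAppID(mode, "default", "0001"))
}
```

Keeping the mode as a plain argument, rather than reading the environment inside the function, makes both strategies easy to cover in unit tests.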






[jira] [Resolved] (YUNIKORN-173) Generates one default application ID per namespace in the admission controller

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg resolved YUNIKORN-173.


Merged the change to move to just the namespace for now.

Opened two new jiras: one to fix the code's assumption that the config is 
available, and a second to reinstate the old behaviour and make it configurable.

> Generates one default application ID per namespace in the admission controller
> --
>
> Key: YUNIKORN-173
> URL: https://issues.apache.org/jira/browse/YUNIKORN-173
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
>
> If an app doesn't explicitly specify an application ID, let's group such pods 
> into one single app per namespace.






[jira] [Created] (YUNIKORN-178) Remove call to get config from the admission controller

2020-05-22 Thread Wilfred Spiegelenburg (Jira)
Wilfred Spiegelenburg created YUNIKORN-178:
--

 Summary: Remove call to get config from the admission controller
 Key: YUNIKORN-178
 URL: https://issues.apache.org/jira/browse/YUNIKORN-178
 Project: Apache YuniKorn
  Issue Type: Bug
  Components: shim - kubernetes
Reporter: Wilfred Spiegelenburg


The admission controller has no way to process configuration from anywhere.

However, it does try to retrieve the scheduler name by calling 
{{conf.GetSchedulerConf().SchedulerName}} in {{updateSchedulerName}}. This 
will not work and should be replaced by an environment setting, like what was 
done in YUNIKORN-28 for other values.
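
As an illustration of the proposed direction, here is a minimal sketch that resolves the scheduler name from the environment with a fallback. The variable name {{SCHEDULER_NAME}} and the helper {{schedulerName}} are assumptions made for this example; the fallback value {{yunikorn}} follows the name mentioned in YUNIKORN-177. How YUNIKORN-28 actually wires its values is not shown here.

```go
package main

import (
	"fmt"
	"os"
)

const (
	// schedulerNameEnv is a hypothetical environment variable name.
	schedulerNameEnv = "SCHEDULER_NAME"
	// defaultSchedulerName is used when the variable is unset or empty.
	defaultSchedulerName = "yunikorn"
)

// schedulerName resolves the name through a lookup function with the same
// signature as os.LookupEnv, so no scheduler configuration is needed.
func schedulerName(lookup func(string) (string, bool)) string {
	if value, ok := lookup(schedulerNameEnv); ok && value != "" {
		return value
	}
	return defaultSchedulerName
}

func main() {
	fmt.Println(schedulerName(os.LookupEnv))
}
```

Passing the lookup function in, instead of calling {{os.LookupEnv}} directly, lets a unit test supply its own values without touching the process environment.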


