[ 
https://issues.apache.org/jira/browse/YUNIKORN-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qi Zhu resolved YUNIKORN-1790.
------------------------------
    Fix Version/s: 1.4.0
       Resolution: Fixed

> TestTryAllocatePreemptNode: flaky test
> --------------------------------------
>
>                 Key: YUNIKORN-1790
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1790
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: test - unit
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Qi Zhu
>            Priority: Major
>              Labels: flaky-test, pull-request-available
>             Fix For: 1.4.0
>
>
> The TestTryAllocatePreemptNode fails sometimes with a different allocation 
> being preempted than the expected alloc1:
> {code:java}
> 2023-06-07T12:40:35.035+1000  INFO    objects/queue.go:139    configured 
> queue added to scheduler     {"queueName": "root"}
> 2023-06-07T12:40:35.035+1000  INFO    objects/queue.go:1190   updating root 
> queue max resources       {"current max": "nil resource", "new max": 
> "map[first:40]"}
> 2023-06-07T12:40:35.035+1000  DEBUG   objects/queue.go:850    new parent 
> queue inheriting template from parent queue  {"child queue": "root.parent", 
> "parent queue": "root"}
> 2023-06-07T12:40:35.035+1000  INFO    objects/queue.go:139    configured 
> queue added to scheduler     {"queueName": "root.parent"}
> 2023-06-07T12:40:35.036+1000  DEBUG   objects/queue.go:343    max resources 
> setting ignored: cannot set zero max resources
> 2023-06-07T12:40:35.036+1000  DEBUG   objects/queue.go:350    guaranteed 
> resources setting ignored: cannot set zero max resources
> 2023-06-07T12:40:35.036+1000  INFO    objects/queue.go:139    configured 
> queue added to scheduler     {"queueName": "root.unlimited"}
> 2023-06-07T12:40:35.036+1000  DEBUG   objects/queue.go:343    max resources 
> setting ignored: cannot set zero max resources2023-06-07T12:40:35.036+1000    
>     INFO    objects/queue.go:139    configured queue added to scheduler     
> {"queueName": "root.parent.child1"}
> 2023-06-07T12:40:35.036+1000  DEBUG   objects/queue.go:343    max resources 
> setting ignored: cannot set zero max resources
> 2023-06-07T12:40:35.036+1000  INFO    objects/queue.go:139    configured 
> queue added to scheduler     {"queueName": "root.parent.child2"}
> 2023-06-07T12:40:35.036+1000  INFO    objects/application.go:184      Unknown 
> gang scheduling style, using soft style as default      {"gang scheduling 
> style": ""}
> 2023-06-07T12:40:35.036+1000  DEBUG   objects/queue.go:1780   Unknown 
> application     {"applicationID": "app-0"}
> 2023-06-07T12:40:35.037+1000  INFO    objects/application_state.go:133        
> Application state transition    {"appID": "app-0", "source": "New", 
> "destination": "Accepted", "event": "runApplication"}
> 2023-06-07T12:40:35.037+1000  INFO    objects/application.go:668      ask 
> added successfully to application   {"appID": "app-0", "ask": "alloc0-0", 
> "placeholder": false, "pendingDelta": "map[first:11]"}
> 2023-06-07T12:40:35.037+1000  INFO    objects/application.go:668      ask 
> added successfully to application   {"appID": "app-0", "ask": "alloc0-1", 
> "placeholder": false, "pendingDelta": "map[first:11]"}
> 2023-06-07T12:40:35.037+1000  INFO    objects/application.go:184      Unknown 
> gang scheduling style, using soft style as default      {"gang scheduling 
> style": ""}
> 2023-06-07T12:40:35.037+1000  DEBUG   objects/queue.go:1780   Unknown 
> application     {"applicationID": "app-1"}
> 2023-06-07T12:40:35.037+1000  INFO    objects/application_state.go:133        
> Application state transition    {"appID": "app-1", "source": "New", 
> "destination": "Accepted", "event": "runApplication"}
> 2023-06-07T12:40:35.037+1000  INFO    objects/application.go:668      ask 
> added successfully to application   {"appID": "app-1", "ask": "alloc1", 
> "placeholder": false, "pendingDelta": "map[first:5]"}
> 2023-06-07T12:40:35.037+1000  INFO    objects/application.go:668      ask 
> added successfully to application   {"appID": "app-1", "ask": "alloc2", 
> "placeholder": false, "pendingDelta": "map[first:5]"}
> 2023-06-07T12:40:35.038+1000  INFO    objects/application.go:184      Unknown 
> gang scheduling style, using soft style as default      {"gang scheduling 
> style": ""}
> 2023-06-07T12:40:35.038+1000  DEBUG   objects/queue.go:1780   Unknown 
> application     {"applicationID": "app-2"}
> 2023-06-07T12:40:35.038+1000  INFO    objects/application_state.go:133        
> Application state transition    {"appID": "app-2", "source": "New", 
> "destination": "Accepted", "event": "runApplication"}
> 2023-06-07T12:40:35.038+1000  INFO    objects/application.go:668      ask 
> added successfully to application   {"appID": "app-2", "ask": "alloc3", 
> "placeholder": false, "pendingDelta": "map[first:5]"}
> 2023-06-07T12:40:35.038+1000  DEBUG   objects/application.go:339      
> Application state timer initiated       {"appID": "app-0", "state": 
> "Starting", "timeout": "5m0s"}
> 2023-06-07T12:40:35.038+1000  INFO    objects/application_state.go:133        
> Application state transition    {"appID": "app-0", "source": "Accepted", 
> "destination": "Starting", "event": "runApplication"}
> 2023-06-07T12:40:35.038+1000  DEBUG   ugm/manager.go:63       Increasing 
> resource usage       {"user": "testuser", "queue path": "root.unlimited", 
> "application": "app-0", "resource": "map[first:11]"}
> 2023-06-07T12:40:35.038+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.038+1000  DEBUG   ugm/queue_tracker.go:45 Creating queue 
> tracker object for queue {"queue": "unlimited"}
> 2023-06-07T12:40:35.038+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.038+1000  DEBUG   ugm/manager.go:265      Group tracker 
> already exists and linking (reusing) the same with application    
> {"application": "app-0", "queue path": "root.unlimited", "user": "testuser", 
> "group": "testgroup"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/queue_tracker.go:45 Creating queue 
> tracker object for queue {"queue": "unlimited"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000  DEBUG   objects/application.go:388      
> Application state timer cleared {"appID": "app-0", "state": "Starting"}
> 2023-06-07T12:40:35.039+1000  INFO    objects/application_state.go:133        
> Application state transition    {"appID": "app-0", "source": "Starting", 
> "destination": "Running", "event": "runApplication"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/manager.go:63       Increasing 
> resource usage       {"user": "testuser", "queue path": "root.unlimited", 
> "application": "app-0", "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "unlimited", "application": "app-0", 
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.040+1000  DEBUG   objects/application.go:339      
> Application state timer initiated       {"appID": "app-1", "state": 
> "Starting", "timeout": "5m0s"}
> 2023-06-07T12:40:35.040+1000  INFO    objects/application_state.go:133        
> Application state transition    {"appID": "app-1", "source": "Accepted", 
> "destination": "Starting", "event": "runApplication"}
> 2023-06-07T12:40:35.040+1000  DEBUG   ugm/manager.go:63       Increasing 
> resource usage       {"user": "testuser", "queue path": "root.parent.child1", 
> "application": "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.parent.child1", "application": 
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "parent.child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.parent.child1", "application": 
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "parent.child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   objects/application.go:388      
> Application state timer cleared {"appID": "app-1", "state": "Starting"}
> 2023-06-07T12:40:35.041+1000  INFO    objects/application_state.go:133        
> Application state transition    {"appID": "app-1", "source": "Starting", 
> "destination": "Running", "event": "runApplication"}
> 2023-06-07T12:40:35.041+1000  DEBUG   ugm/manager.go:63       Increasing 
> resource usage       {"user": "testuser", "queue path": "root.parent.child1", 
> "application": "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.parent.child1", "application": 
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "parent.child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "root.parent.child1", "application": 
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "parent.child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   ugm/queue_tracker.go:57 Increasing 
> resource usage       {"queue path": "child1", "application": "app-1", 
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000  DEBUG   objects/application.go:1454     app 
> reservation check   {"allocationKey": "alloc3", "createTime": 
> "2023-06-07T12:40:25.038+1000", "askAge": "10.003814459s", 
> "reservationDelay": "2s"}
> 2023-06-07T12:40:35.042+1000  DEBUG   objects/application.go:1454     app 
> reservation check   {"allocationKey": "alloc3", "createTime": 
> "2023-06-07T12:40:25.038+1000", "askAge": "10.003914s", "reservationDelay": 
> "2s"}
> 2023-06-07T12:40:35.042+1000  DEBUG   objects/application.go:1470     found 
> candidate node for app reservation        {"appID": "app-2", "nodeID": 
> "node1", "allocationKey": "alloc3", "reservations": 0, "pendingRepeats": 1}
> 2023-06-07T12:40:35.042+1000  DEBUG   objects/application.go:1454     app 
> reservation check   {"allocationKey": "alloc3", "createTime": 
> "2023-06-07T12:39:55.038+1000", "askAge": "40.004148584s", 
> "reservationDelay": "2s"}
> 2023-06-07T12:40:35.042+1000  DEBUG   objects/application.go:1454     app 
> reservation check   {"allocationKey": "alloc3", "createTime": 
> "2023-06-07T12:39:55.038+1000", "askAge": "40.004227875s", 
> "reservationDelay": "2s"}
> 2023-06-07T12:40:35.042+1000  DEBUG   objects/preemption.go:352       No RM 
> callback plugin registered, using first selected node for preemption      
> {"NodeID": "node2", "AllocationKey": "alloc3"}
> 2023-06-07T12:40:35.043+1000  ERROR   objects/preemption.go:517       BUG: 
> Didn't find instance type in the nodeInstanceTypeMap       {"nodeId": "node2"}
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).tryNodes
>       
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:517
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).TryPreemption
>       
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:537
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryPreemption
>       
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1327
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryAllocate
>       
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1050
> github.com/apache/yunikorn-core/pkg/scheduler/objects.TestTryAllocatePreemptNode
>       
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application_test.go:1874
> testing.tRunner       /usr/local/go/src/testing/testing.go:1576
> 2023-06-07T12:40:35.043+1000  INFO    objects/preemption.go:559       
> Preempting task {"applicationID": "app-1", "allocationKey": "alloc2", 
> "nodeID": "node2", "resources": "map[first:5]"}
> 2023-06-07T12:40:35.043+1000  INFO    objects/preemption.go:580       
> Reserving node for ask after preemption {"allocationKey": "alloc3", "nodeID": 
> "node2", "victimCount": 1}
> --- FAIL: TestTryAllocatePreemptNode (0.06s)
>     application_test.go:1876: assertion failed: expression is false: 
> alloc1.IsPreempted(): alloc1 should have been preempted {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org

Reply via email to